summaryrefslogtreecommitdiff
path: root/src/stdio
AgeCommit message (Collapse)AuthorFilesLines
2017-04-22remove va_arg hacks in printf core with undefined behaviorRich Felker1-26/+1
the code being removed was written to optimize for size assuming the compiler cannot collapse code paths for different types with the same underlying representation. modern compilers sometimes succeed in making this optimization themselves, but either way it's a small size difference and not worth the source-level complexity or the UB involved in this hack. some incorrect use of va_arg still remains, particularly use of void * where the actual argument has a different pointer type. fixing this requires some actual code additions, rather than just removing cruft, so I'm leaving it to be done later as a separate commit.
2017-03-14fix wide scanf's use of a compound literal past its lifetimeRich Felker1-1/+2
2016-11-07fix swprintf internal buffer state and error handlingRich Felker1-1/+8
the swprintf write callback never reset its buffer pointers, so after its 256-byte buffer filled up, it would keep repeating those bytes over and over in the output until the destination buffer filled up. it also failed to set the error indicator for the stream on EILSEQ, potentially allowing output to continue after the error.
2016-10-21redesign snprintf without undefined behaviorRich Felker1-25/+38
the old snprintf design setup the FILE buffer pointers to point directly into the destination buffer; if n was actually larger than the buffer size, the pointer arithmetic to compute the buffer end pointer was undefined. this affected sprintf, which is implemented in terms of snprintf, as well as some unusual but valid direct uses of snprintf. instead, setup the FILE as unbuffered and have its write function memcpy to the destination. the printf core sets up its own temporary buffer for unbuffered streams.
2016-10-20fix float formatting of some exact halfway casesSzabolcs Nagy1-1/+2
in nearest rounding mode exact halfway cases were not following the round to even rule if the rounding happened at a base 1000000000 digit boundary of the internal representation and the previous digit was odd. e.g. printf("%.0f", 1.5) printed 1 instead of 2.
2016-10-20fix integer overflows and uncaught EOVERFLOW in printf coreRich Felker2-46/+89
this patch fixes a large number of missed internal signed-overflow checks and errors in determining when the return value (output length) would exceed INT_MAX, which should result in EOVERFLOW. some of the issues fixed were reported by Alexander Cherepanov; others were found in subsequent review of the code. aside from the signed overflows being undefined behavior, the following specific bugs were found to exist in practice: - overflows computing length of floating point formats with huge explicit precisions, integer formats with prefix characters and huge explicit precisions, or string arguments or format strings longer than INT_MAX, resulted in wrong return value and wrong %n results. - literal width and precision values outside the range of int were misinterpreted, yielding wrong behavior in at least one well-defined case: string formats with precision greater than INT_MAX were sometimes truncated. - in cases where EOVERFLOW is produced, incorrect values could be written for %n specifiers past the point of exceeding INT_MAX. in addition to fixing these bugs, we now stop producing output immediately when output length would exceed INT_MAX, rather than continuing and returning an error only at the end.
2016-10-19fix integer overflow in float printf needed-precision computationRich Felker1-1/+1
if the requested precision is close to INT_MAX, adding LDBL_MANT_DIG/3+8 overflows. in practice the resulting undefined behavior manifests as a large negative result, which is then used to compute the new end pointer (z) with a wildly out-of-bounds value (more overflow, more undefined behavior). the end result is at least incorrect output and character count (return value); worse things do not seem to happen, but detailed analysis has not been done. this patch fixes the overflow by performing the intermediate computation as unsigned; after division by 9, the final result necessarily fits in int.
2016-09-18simplify/refactor fflush and make fflush_unlocked an alias for fflushRich Felker1-30/+23
previously, fflush_unlocked was an alias for an internal backend that was called by fflush, either for its argument or in a loop for each file if a null pointer was passed. since the logic for the latter was in the main fflush function, fflush_unlocked crashed when passed a null pointer, rather than flushing all open files. since fflush_unlocked is not a standard function and has no specification, it's not clear whether it should be expected to accept null pointers like fflush does, but a reasonable argument could be made that it should. this patch eliminates the helper function, simplifying fflush, and makes fflush_unlocked an alias for fflush, which is valid because the two functions agree in their behavior in all cases where their behavior is defined (the unlocked version has undefined behavior if another thread could hold locks).
2016-09-16fix printf regression with alt-form octal, zero flag, and field widthRich Felker1-1/+1
commit b91cdbe2bc8b626aa04dc6e3e84345accf34e4b1, in fixing another issue, changed the logic for how alt-form octal adds the leading zero to adjust the precision rather than using a prefix character. this wrongly suppressed the zero flag by mimicing an explicit precision given by the format string. switch back to using a prefix character. based on bug report and patch by Dmitry V. Levin, but simplified.
2016-04-26fix FILE buffer underflow in ungetwcRich Felker1-3/+3
commit 7e816a6487932cbb3cb71d94b609e50e81f4e5bf (version 1.1.11 release cycle) moved the code that performs wchar_t to multibyte conversion across code that used the resulting length in bytes, thereby breaking the unget buffer space check in ungetwc and clobbering up to three bytes below the start of the buffer. for allocated FILEs (all read-enabled FILEs except stdin), the underflow clobbers at most the FILE-specific locale pointer. no stores are performed through this pointer, but subsequent loads may result in a crash or mismatching encoding rule (UTF-8 multibyte vs byte-based). for stdin, the buffer lies in .bss and the underflow may clobber another object. in practice, for libc.so the adjacent object seems to be stderr's buffer, which is completely unused, but this could vary with linking options, or when static linking. applications which do not attempt to use more than one character of ungetwc pushback, or which do not use ungetwc, are not affected.
2016-03-28fix undefined pointer comparison in stdio-internal __toreadRich Felker1-1/+1
the comparison f->wpos > f->buf has undefined behavior when f->wpos is a null pointer, despite the intuition (and actual compiler behavior, for all known compilers) being that NULL > ptr is false for all valid pointers ptr. the purpose of the comparison is to determine if the write buffer is non-empty, and the idiom used elsewhere for that is comparison against f->wbase, which is either a null pointer when not writing, or equal to f->buf when writing. in the former case, both f->wpos and f->wbase are null; in the latter they are both non-null and point into the same array.
2016-03-16fix padding string formats to width in wide printf variantsRich Felker1-4/+4
the idiom fprintf(f, "%.*s", n, "") was wrongly used in vfwprintf as a means of producing n spaces; instead it produces no output. the correct form is fprintf(f, "%*s", n, ""), using width instead of precision, since for %s the later is a maximum rather than a minimum.
2016-02-16fix assumption in fputs that fwrite returning 0 implies an errorRich Felker1-1/+2
internally, the idiom of passing nmemb=1 to fwrite and interpreting the return value of fwrite (which is necessarily 0 or 1) as failure/success is fairly widely used. this is not correct, however, when the size argument is unknown and may be zero, since C requires fwrite to return 0 in that special case. previously fwrite always returned nmemb on success, but this was changed for conformance with ISO C by commit 500c6886c654fd45e4926990fee2c61d816be197.
2016-02-10fix return value for fread/fwrite when size argument is 0Rich Felker2-0/+2
when the size argument was zero but nmemb was nonzero, these functions were returning nmemb, despite no data having been written. conceptually this is not wrong, but the standard requires a return value of zero in this case.
2016-02-10fix failed write reporting by fwrite in line-buffered modeRich Felker1-2/+2
when a write error occurred while flushing output due to a newline, fwrite falsely reported all bytes up to and including the newline as successfully written. in general, due to buffering such "spurious success" returns are acceptable for stdio; however for line-buffered mode it was subtly wrong. errors were still visible via ferror() or as a short-write return if there was more data past the newline that should have been written, but since the contract for line-buffered mode is that everything up through the newline be written out immediately, a discrepency was observable in the actual file contents.
2015-12-20fix overly pessimistic realloc strategy in getdelimRich Felker1-1/+1
previously, getdelim was allocating twice the space needed every time it expanded its buffer to implement exponential buffer growth (in order to avoid quadratic run time). however, this doubling was performed even when the final buffer length needed was already known, which is the common case that occurs whenever the delimiter is in the FILE's buffer. this patch makes two changes to remedy the situation: 1. over-allocation is no longer performed if the delimiter has already been found when realloc is needed. 2. growth factor is reduced from 2x to 1.5x to reduce the relative excess allocation in cases where the delimiter is not initially in the buffer, including unbuffered streams. in theory these changes could lead to quadratic time if the same buffer is reused to process a sequence of lines successively increasing in length, but once this length exceeds the stdio buffer size, the delimiter will not be found in the buffer right away and exponential growth will still kick in.
2015-12-19avoid updating caller's size when getdelim fails to reallocRich Felker1-5/+6
getdelim was updating *n, the caller's stored buffer size, before calling realloc. if getdelim then failed due to realloc failure, the caller would see in *n a value larger than the actual size of the allocated block, and use of that value is unsafe. in particular, passing it again to getdelim is unsafe. now, temporary storage is used for the desired new size, and *n is not written until realloc succeeds.
2015-10-24fix single-byte overflow of malloc'd buffer in getdelimRich Felker1-1/+1
the buffer enlargement logic here accounted for the terminating null byte, but not for the possibility of hitting the delimiter in the buffer-refill code path that uses getc_unlocked, in which case two additional bytes (the delimiter and the null termination) are written without another chance to enlarge the buffer. this patch and the corresponding bug report are by Felix Janda.
2015-10-08fix open_[w]memstream behavior when no writes take placeRich Felker2-4/+18
the specification for these functions requires that the buffer/size exposed to the caller be valid after any successful call to fflush or fclose on the stream. the implementation's approach is to update them only at flush time, but that misses the case where fflush or fclose is called without any writes having taken place, in which case the write flushing callback will not be called. to fix both the observable bug and the desired invariant, setup empty buffers at open time and fail the open operation if no memory is available.
2015-09-09fix fclose of permanent (stdin/out/err) streamsRich Felker1-2/+3
this fixes a bug reported by Nuno Gonçalves. previously, calling fclose on stdin or stdout resulted in deadlock at exit time, since __stdio_exit attempts to lock these streams to flush/seek them, and has no easy way of knowing that they were closed. conceptually, leaving a FILE stream locked on fclose is valid since, in the abstract machine, it ceases to exist. but to satisfy the implementation-internal assumption in __stdio_exit that it can access these streams unconditionally, we need to unlock them. it's also necessary that fclose leaves permanent streams in a state where __stdio_exit will not attempt any further operations on them. fortunately, the call to fflush already yields this property.
2015-08-09fix failure of tempnam to null-terminate resultRich Felker1-0/+1
tempnam uses an uninitialized buffer which is filled using memcpy and __randname. It is therefore necessary to explicitly null-terminate it. based on patch by Felix Janda.
2015-06-16refactor stdio open file list handling, move it out of global libc structRich Felker9-36/+37
functions which open in-memory FILE stream variants all shared a tail with __fdopen, adding the FILE structure to stdio's open file list. replacing this common tail with a function call reduces code size and duplication of logic. the list is also partially encapsulated now. function signatures were chosen to facilitate tail call optimization and reduce the need for additional accessor functions. with these changes, static linked programs that do not use stdio no longer have an open file list at all.
2015-06-16byte-based C locale, phase 2: stdio and iconv (multibyte callers)Rich Felker5-8/+33
this patch adjusts libc components which use the multibyte functions internally, and which depend on them operating in a particular encoding, to make the appropriate locale changes before calling them and restore the calling thread's locale afterwards. activating the byte-based C locale without these changes would cause regressions in stdio and iconv. in the case of iconv, the current implementation was simply using the multibyte functions as UTF-8 conversions. setting a multibyte UTF-8 locale for the duration of the iconv operation allows the code to continue working. in the case of stdio, POSIX requires that FILE streams have an encoding rule bound at the time of setting wide orientation. as long as all locales, including the C locale, used the same encoding, treating high bytes as UTF-8, there was no need to store an encoding rule as part of the stream's state. a new locale field in the FILE structure points to the locale that should be made active during fgetwc/fputwc/ungetwc on the stream. it cannot point to the locale active at the time the stream becomes oriented, because this locale could be mutable (the global locale) or could be destroyed (locale_t objects produced by newlocale) before the stream is closed. instead, a pointer to the static C or C.UTF-8 locale object added in commit commit aeeac9ca5490d7d90fe061ab72da446c01ddf746 is used. this is valid since categories other than LC_CTYPE will not affect these functions.
2015-06-13remove cancellation points in stdioRich Felker3-24/+3
commit 58165923890865a6ac042fafce13f440ee986fd9 added these optional cancellation points on the basis that cancellable stdio could be useful, to unblock threads stuck on stdio operations that will never complete. however, the only way to ensure that cancellation can achieve this is to violate the rules for side effects when cancellation is acted upon, discarding knowledge of any partial data transfer already completed. our implementation exhibited this behavior and was thus non-conforming. in addition to improving correctness, removing these cancellation points moderately reduces code size, and should significantly improve performance on i386, where sysenter/syscall instructions can be used instead of "int $128" for non-cancellable syscalls.
2015-06-13fix idiom for setting stdio stream orientation to wideRich Felker6-6/+6
the old idiom, f->mode |= f->mode+1, was adapted from the idiom for setting byte orientation, f->mode |= f->mode-1, but the adaptation was incorrect. unless the stream was alreasdy set byte-oriented, this code incremented f->mode each time it was executed, which would eventually lead to overflow. it could be fixed by changing it to f->mode |= 1, but upcoming changes will require slightly more work at the time of wide orientation, so it makes sense to just call fwide. as an optimization in the single-character functions, fwide is only called if the stream is not already wide-oriented.
2015-06-13add printing of null %s arguments as "(null)" in wide printfRich Felker1-0/+1
this is undefined, but supported in our implementation of the normal printf, so for consistency the wide variant should support it too.
2015-06-13add %m support to wide printfRich Felker1-0/+2
2015-06-06remove another invalid skip of locking in ungetwcRich Felker1-3/+1
2015-06-06remove invalid skip of locking in ungetwcRich Felker1-6/+3
aside from being invalid, the early check only optimized the error case, and likely pessimized the common case by separating the two branches on isascii(c) at opposite ends of the function.
2015-05-29fix failure of ungetc and ungetwc to work on files in eof statusRich Felker5-10/+11
these functions were written to handle clearing eof status, but failed to account for the __toread function's handling of eof. with this patch applied, __toread still returns EOF when the file is in eof status, so that read operations will fail, but it also sets up valid buffer pointers for read mode, which are set to the end of the buffer rather than the beginning in order to make the whole buffer available to ungetc/ungetwc. minor changes to __uflow were needed since it's now possible to have non-zero buffer pointers while in eof status. as made, these changes remove a 'fast path' bypassing the function call to __toread, which could be reintroduced with slightly different logic, but since ordinary files have a syscall in f->read, optimizing the code path does not seem worthwhile. the __stdio_read function is also updated not to zero the read buffer pointers on eof/error. while not necessary for correctness, this change avoids the overhead of calling __toread in ungetc after reaching eof, and it also reduces code size and increases consistency with the fmemopen read operation which does not zero the pointers.
2015-04-04fix getdelim to set the error indicator on all failuresSzabolcs Nagy1-2/+5
2015-02-23fix possible isatty false positives and unwanted device state changesRich Felker2-6/+4
the equivalent checks for newly opened stdio output streams, used to determine buffering mode, are also fixed. on most archs, the TCGETS ioctl command shares a value with SNDCTL_TMR_TIMEBASE, part of the OSS sound API which was apparently used with certain MIDI and timer devices. for file descriptors referring to such a device, TCGETS will not fail with ENOTTY as expected; it may produce a different error, or may succeed, and if it succeeds it changes the mode of the device. while it's unlikely that such devices are in use, this is in principle very harmful behavior for an operation which is supposed to do nothing but query whether the fd refers to a tty. TIOCGWINSZ, used to query logical window size for a terminal, was chosen as an alternate ioctl to perform the isatty check. it does not share a value with any other ioctl commands, and it succeeds on any tty device. this change also cleans up strace output to be less ugly and misleading.
2015-02-13overhaul aio implementation for correctnessRich Felker1-1/+8
previously, aio operations were not tracked by file descriptor; each operation was completely independent. this resulted in non-conforming behavior for non-seekable/append-mode writes (which are required to be ordered) and made it impossible to implement aio_cancel, which in turn made closing file descriptors with outstanding aio operations unsafe. the new implementation is significantly heavier (roughly twice the size, and seems to be slightly slower) and presently aims mainly at correctness, not performance. most of the public interfaces have been moved into a single file, aio.c, because there is little benefit to be had from splitting them. whenever any aio functions are used, aio_cancel and the internal queue lifetime management and fd-to-queue mapping code must be linked, and these functions make up the bulk of the code size. the close function's interaction with aio is implemented with weak alias magic, to avoid pulling in heavy aio cancellation code in programs that don't use aio, and the expensive cancellation path (which includes signal blocking) is optimized out when there are no active aio queues.
2014-12-18don't suppress sign output for NANs in printfRich Felker1-1/+1
formally, it seems a sign is only required when the '+' modifier appears in the format specifier, in which case either '+' or '-' must be present in the output. but the specification is written such that an optional negative sign is part of the output format anyway, and the simplest approach to fixing the problem is removing the code that was suppressing the sign.
2014-12-17correctly handle write errors encountered by printf-family functionsRich Felker2-2/+12
previously, write errors neither stopped further output attempts nor caused the function to return an error to the caller. this could result in silent loss of output, possibly in the middle of output in the event of a non-permanent error. the simplest solution is temporarily clearing the error flag for the target stream, then suppressing further output when the error flag is set and checking/restoring it at the end of the operation to determine the correct return value. since the wide version of the code internally calls the narrow fprintf to perform some of its underlying operations, initial clearing of the error flag is suppressed when performing a narrow vfprintf on a wide-oriented stream. this is not a problem since the behavior of narrow operations on wide-oriented streams is undefined.
2014-11-15fix behavior of printf with alt-form octal, zero precision, zero valueRich Felker1-1/+1
in this case there are two conflicting rules in play: that an explicit precision of zero with the value zero produces no output, and that the '#' modifier for octal increases the precision sufficiently to yield a leading zero. ISO C (7.19.6.1 paragraph 6 in C99+TC3) includes a parenthetical remark to clarify that the precision-increasing behavior takes precedence, but the corresponding text in POSIX off of which I based the implementation is missing this remark. this issue was covered in WG14 DR#151.
2014-09-19fix linked list corruption in flockfile listsRich Felker1-0/+1
commit 5345c9b884e7c4e73eb2c8bb83b8d0df20f95afb added a linked list to track the FILE streams currently locked (via flockfile) by a thread. due to a failure to fully link newly added members, removal from the list could leave behind references which could later result in writes to already-freed memory and possibly other memory corruption. implicit stdio locking was unaffected; the list is only used in conjunction with explicit flockfile locking. this bug was not present in any releases; it was introduced and fixed during the same release cycle. patch by Timo Teräs, who discovered and tracked down the bug.
2014-09-04fix multiple stdio functions' behavior on zero-length operationsRich Felker4-9/+7
previously, fgets, fputs, fread, and fwrite completely omitted locking and access to the FILE object when their arguments yielded a zero length read or write operation independent of the FILE state. this optimization was invalid; it wrongly skipped marking the stream as byte-oriented (a C conformance bug) and exposed observably missing synchronization (a POSIX conformance bug) where one of these functions could wrongly complete despite another thread provably holding the lock.
2014-09-04suppress null termination when fgets reads EOF with no dataRich Felker1-1/+1
the C standard requires that "the contents of the array remain unchanged" in this case. this patch also changes the behavior on read errors, but in that case "the array contents are indeterminate", so the application cannot inspect them anyway.
2014-08-23fix false ownership of stdio FILEs due to tid reuseRich Felker3-2/+36
this is analogous commit fffc5cda10e0c5c910b40f7be0d4fa4e15bb3f48 which fixed the corresponding issue for mutexes. the robust list can't be used here because the locks do not share a common layout with mutexes. at some point it may make sense to simply incorporate a mutex object into the FILE structure and use it, but that would be a much more invasive change, and it doesn't mesh well with the current design that uses a simpler code path for internal locking and pulls in the recursive-mutex-like code when the flockfile API is used explicitly.
2014-07-16work around constant folding bug 61144 in gcc 4.9.0 and 4.9.1Rich Felker5-5/+5
previously we detected this bug in configure and issued advice for a workaround, but this turned out not to work. since then gcc 4.9.0 has appeared in several distributions, and now 4.9.1 has been released without a fix despite this being a wrong code generation bug which is supposed to be a release-blocker, per gcc policy. since the scope of the bug seems to affect only data objects (rather than functions) whose definitions are overridable, and there are only a very small number of these in musl, I am just changing them from const to volatile for the time being. simply removing the const would be sufficient to make gcc 4.9.1 work (the non-const case was inadvertently fixed as part of another change in gcc), and this would also be sufficient with 4.9.0 if we forced -O0 on the affected files or on the whole build. however it's cleaner to just remove all the broken compiler detection and use volatile, which will ensure that they are never constant-folded. the quality of a non-broken compiler's output should not be affected except for the fact that these objects are no longer const and thus possibly add a few bytes to data/bss. this change can be reconsidered and possibly reverted at some point in the future when the broken gcc versions are no longer relevant.
2014-07-16simplify __stdio_exit static linking logicRich Felker3-11/+8
the purpose of this logic is to avoid linking __stdio_exit unless any stdio reads (which might require repositioning the file offset at exit time) or writes (which might require flushing at exit time) could have been performed. previously, exit called two wrapper functions for __stdio_exit named __flush_on_exit and __seek_on_exit. both of these functions actually performed both tasks (seek and flushing) by calling the underlying __stdio_exit. in order to avoid doing this twice, an overridable data object __towrite_used was used to cause __seek_on_exit to act as a nop when __towrite was linked. now, exit only makes one call, directly to __stdio_exit. this is satisfiable by a weak dummy definition in exit.c, but the real definition is pulled in by either __toread.c or __towrite.c through their referencing a symbol which is defined only in __stdio_exit.c.
2014-07-02fix failure of wide printf/scanf functions to set wide orientationRich Felker2-0/+3
in some cases, these functions internally call a byte-based input or output function before calling getwc/putwc, so they cannot rely on the latter to set the orientation.
2014-07-01fix incorrect return value for fwide functionRich Felker1-1/+2
when the orientation of the stream was already set, fwide was incorrectly returning its argument (the requested orientation) rather than the actual orientation of the stream.
2014-06-10replace all remaining internal uses of pthread_self with __pthread_selfRich Felker1-1/+1
prior to version 1.1.0, the difference between pthread_self (the public function) and __pthread_self (the internal macro or inline function) was that the former would lazily initialize the thread pointer if it was not already initialized, whereas the latter would crash in this case. since lazy initialization is no longer supported, use of pthread_self no longer makes sense; it simply generates larger, slower code.
2014-06-06add O_CLOEXEC fallback for open and related functionsRich Felker2-0/+3
since there is no easy way to detect whether open honored or ignored the O_CLOEXEC flag, the optimal solution to providing a fallback is simply to make the fcntl syscall to set the close-on-exec flag immediately after open returns.
2014-06-06fix fd leak in tmpfile when the fdopen operation failsRich Felker1-1/+2
this condition could only happen due to malloc failure. the fdopen operation is also moved to take place after the unlink to minimize the window during which a link to the file exists in the directory table.
2014-06-04simplify vasprintf implementationRich Felker1-14/+1
the old implementation preallocated a buffer in order to try to avoid calling vsnprintf more than once. not only did this potentially lead to memory fragmentation from trimming with realloc; it also pulled in realloc/free, which otherwise might not be needed in a static linked program.
2014-05-30use cleaner code for handling float rounding in vfprintfSzabolcs Nagy1-3/+1
CONCAT(0x1p,LDBL_MANT_DIG) is not safe outside of libc, use 2/LDBL_EPSILON instead. fix was proposed by Morten Welinder.
2014-05-29support linux kernel apis (new archs) with old syscalls removedRich Felker5-2/+31
such archs are expected to omit definitions of the SYS_* macros for syscalls their kernels lack from arch/$ARCH/bits/syscall.h. the preprocessor is then able to select the an appropriate implementation for affected functions. two basic strategies are used on a case-by-case basis: where the old syscalls correspond to deprecated library-level functions, the deprecated functions have been converted to wrappers for the modern function, and the modern function has fallback code (omitted at the preprocessor level on new archs) to make use of the old syscalls if the new syscall fails with ENOSYS. this also improves functionality on older kernels and eliminates the incentive to program with deprecated library-level functions for the sake of compatibility with older kernels. in other situations where the old syscalls correspond to library-level functions which are not deprecated but merely lack some new features, such as the *at functions, the old syscalls are still used on archs which support them. this may change at some point in the future if or when fallback code is added to the new functions to make them usable (possibly with reduced functionality) on old kernels.