|
commit b114190b29417fff6f701eea3a3b3b6030338280 introduced spurious
realloc of the output buffer in cases where the result would exactly
fit in the caller-provided buffer. this is contrary to a strict
reading of the spec, which only allows realloc when the provided
buffer is "of insufficient size".
revert the adjustment of the realloc threshold, and instead push the
byte read by getc_unlocked (for which the adjustment was made) back
into the stdio buffer if it does not fit in the output buffer, to be
read in the next loop iteration.
in order not to leave a pushed-back byte in the stdio buffer if
realloc fails (which would violate the invariant that logical FILE
position and underlying open file description offset match for
unbuffered FILEs), the OOM code path must be changed. it would suffice
to move just one byte in this case, but from a QoI perspective, in the
event of ENOMEM the entire output buffer (up to the allocated length
reported via *n) should contain bytes read from the FILE stream.
otherwise the caller has no way to distinguish truncated data from
uninitialized buffer space.
the SIZE_MAX/2 check is removed since the sum of disjoint object sizes
is assumed not to be able to overflow, leaving just one OOM code path.
|
|
morally, for null pointers a and b, a-b, a<b, and a>b should all be
defined as 0; however, C does not define any of them.
the stdio implementation makes heavy use of such pointer comparison
and subtraction for buffer logic, and also uses null pos/base/end
pointers to indicate that the FILE is not in the corresponding (read
or write) mode ready for accesses through the buffer.
all of the comparisons are fixed trivially by using != in place of the
relational operators, since the opposite relation (e.g. pos>end) is
logically impossible. the subtractions have been reviewed to check
that they are conditional on the stream being in the appropriate
reading- or writing-through-buffer mode, with checks added where needed.
in fgets and getdelim, the checks added should improve performance for
unbuffered streams by avoiding a do-nothing call to memchr, and should
be negligible for buffered streams.
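
the pattern described above can be sketched as follows. this is only an illustration: struct rbuf and both helper names are hypothetical stand-ins, not musl's FILE or its internal functions, and the sketch assumes the invariant that the two pointers are either both null or both non-null.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the read-side pointers of a FILE.
 * Both pointers are null when the stream is not in read mode. */
struct rbuf {
	const unsigned char *rpos, *rend;
};

/* Nonzero if buffered bytes are available. Uses != rather than <:
 * equality comparison of two null pointers is defined, while a
 * relational comparison of them is not. rpos > rend cannot occur,
 * so != gives the same answer as < would. */
static int rbuf_has_data(const struct rbuf *b)
{
	/* Assumed invariant: both null or both non-null. */
	assert((b->rpos == 0) == (b->rend == 0));
	return b->rpos != b->rend;
}

/* Number of buffered bytes. The subtraction is evaluated only when
 * the pointers differ, which under the invariant above means both
 * are non-null and point into the same buffer. */
static size_t rbuf_avail(const struct rbuf *b)
{
	return rbuf_has_data(b) ? (size_t)(b->rend - b->rpos) : 0;
}
```

a caller that wants to scan the buffer (for example with memchr) can test rbuf_has_data first and skip the scan entirely when it returns 0.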
|
|
if EINVAL or ENOMEM happened before the first getc_unlocked, it was
possible that the stream orientation had not yet been set.
|
|
libc.h was intended to be a header for access to global libc state and
related interfaces, but ended up included all over the place because
it was the way to get the weak_alias macro. most of the inclusions
removed here are places where weak_alias was needed. a few were
recently introduced for the hidden macro. some go all the way back to when
libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented)
cancellation points had to include it.
remaining spurious users are mostly callers of the LOCK/UNLOCK macros
and files that use the LFS64 macro to define the awful *64 aliases.
in a few places, new inclusion of libc.h is added because several
internal headers no longer implicitly include libc.h.
declarations for __lockfile and __unlockfile are moved from libc.h to
stdio_impl.h so that the latter does not need libc.h. putting them in
libc.h made no sense at all, since the macros in stdio_impl.h are
needed to use them correctly anyway.
|
|
|
|
previously, getdelim was allocating twice the space needed every time
it expanded its buffer to implement exponential buffer growth (in
order to avoid quadratic run time). however, this doubling was
performed even when the final buffer length needed was already known,
which is the common case that occurs whenever the delimiter is in the
FILE's buffer.
this patch makes two changes to remedy the situation:
1. over-allocation is no longer performed if the delimiter has already
been found when realloc is needed.
2. growth factor is reduced from 2x to 1.5x to reduce the relative
excess allocation in cases where the delimiter is not initially in the
buffer, including unbuffered streams.
in theory these changes could lead to quadratic time if the same
buffer is reused to process a sequence of lines successively
increasing in length, but once this length exceeds the stdio buffer
size, the delimiter will not be found in the buffer right away and
exponential growth will still kick in.
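
the two rules above can be sketched as a single sizing helper. next_capacity and its parameters are hypothetical names chosen for illustration; the actual patch may compute the size differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: capacity to request when `needed` bytes
 * (including the terminating null byte) do not fit in the buffer. */
static size_t next_capacity(size_t needed, int delim_found)
{
	size_t cap = needed;

	/* Rule 1: the final length is known, so allocate exactly that.
	 * Rule 2: otherwise over-allocate by half (1.5x growth) so that
	 * repeated enlargement stays amortized linear. If the addition
	 * would wrap around, fall back to the exact size. */
	if (!delim_found) {
		size_t extra = needed / 2;
		if (needed <= (size_t)-1 - extra)
			cap = needed + extra;
	}

	assert(cap >= needed);
	return cap;
}
```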
|
|
getdelim was updating *n, the caller's stored buffer size, before
calling realloc. if getdelim then failed due to realloc failure, the
caller would see in *n a value larger than the actual size of the
allocated block, and use of that value is unsafe. in particular,
passing it again to getdelim is unsafe.
now, temporary storage is used for the desired new size, and *n is not
written until realloc succeeds.
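
a minimal sketch of the "commit only on success" pattern, assuming a hypothetical helper grow_buffer that is not part of musl:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: grow *buf to at least `want` bytes.
 * Returns 0 on success, -1 on allocation failure. On failure,
 * *buf and *n are left untouched, so *n never exceeds the size of
 * the block that *buf actually points to. */
static int grow_buffer(char **buf, size_t *n, size_t want)
{
	assert(buf && n);
	if (want <= *n) return 0;     /* already large enough */

	size_t m = want;              /* temporary, not yet visible to the caller */
	char *tmp = realloc(*buf, m);
	if (!tmp) return -1;          /* *buf and *n still describe the old block */

	*buf = tmp;
	*n = m;                       /* commit only after realloc succeeded */
	return 0;
}
```

because the old pointer and size remain valid after a failure, the caller can still free the buffer or pass the same pair to a later call.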
|
|
the buffer enlargement logic here accounted for the terminating null
byte, but not for the possibility of hitting the delimiter in the
buffer-refill code path that uses getc_unlocked, in which case two
additional bytes (the delimiter and the null termination) are written
without another chance to enlarge the buffer.
this patch and the corresponding bug report are by Felix Janda.
|
|
|
|
this header evolved to facilitate the extremely lazy practice of
omitting explicit includes of the necessary headers in individual
stdio source files; not only was this sloppy, but it also increased
build time.
now, stdio_impl.h includes only the headers it needs for its own use;
any further headers needed by source files are included directly
where needed.
|
|
to deal with the fact that the public headers may be used with pre-c99
compilers, __restrict is used in place of restrict, and defined
appropriately for any supported compiler. we also avoid the form
[restrict] since older versions of gcc rejected it due to a bug in the
original c99 standard, and instead use the form *restrict.
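
a sketch of how such a definition might look, followed by a declaration in the pointer form. the preprocessor conditions are an assumption about one reasonable way to do it, and add_into is a hypothetical function used only to show the syntax.

```c
#include <assert.h>
#include <stddef.h>

/* C99 and later: map __restrict to the standard keyword.
 * Pre-C99 GNU C: __restrict is already a compiler keyword, so it is
 * left alone. Any other pre-C99 compiler: define it away. */
#if __STDC_VERSION__ >= 199901L
#define __restrict restrict
#elif !defined(__GNUC__)
#define __restrict
#endif

/* Pointer form (*__restrict) rather than the array form
 * (dst[restrict]). Hypothetical function: adds src[i] into dst[i]. */
static void add_into(int *__restrict dst, const int *__restrict src, size_t n)
{
	assert(dst && src);
	for (size_t i = 0; i < n; i++)
		dst[i] += src[i];
}
```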
|
|
for some nonsensical reason, glibc's headers use inline functions that
redirect some of the standard functions to ugly nonstandard names (and
likewise for some of their nonstandard functions).
|
|
the biggest change in this commit is that stdio now uses readv to fill
the caller's buffer and the FILE buffer with a single syscall, and
likewise writev to flush the FILE buffer and write out the caller's
buffer in a single syscall.
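
the read side of this idea can be sketched with a two-element iovec. fill_both is a hypothetical helper written for illustration, not musl's actual read path; it omits the bookkeeping a real implementation needs to record how many bytes landed in each buffer.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: fill the caller's buffer first and let any
 * surplus land in the stream's internal buffer, all with one readv
 * call. Returns what readv returns: the total number of bytes read,
 * 0 at end of file, or -1 on error. */
static ssize_t fill_both(int fd, void *dest, size_t len,
                         void *fbuf, size_t fbuf_size)
{
	assert(dest && fbuf);
	struct iovec iov[2] = {
		{ .iov_base = dest, .iov_len = len },
		{ .iov_base = fbuf, .iov_len = fbuf_size },
	};
	return readv(fd, iov, 2);
}
```

if the return value k exceeds len, the first len bytes are in dest and the remaining k - len bytes are at the start of fbuf, available to later reads without another syscall.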
making this change required fundamental architectural changes to
stdio, so i also made a number of other improvements in the process:
- the implementation no longer assumes that further io will fail
following errors, and no longer blocks io when the error flag is set
(though the latter could easily be changed back if desired)
- unbuffered mode is no longer implemented as a one-byte buffer. as a
consequence, scanf unreading has to use ungetc, so the unget buffer
has been enlarged to hold at least 2 wide characters.
- the FILE structure has been rearranged to maintain the locations of
the fields that might be used in glibc getc/putc type macros, while
shrinking the structure to save some space.
- error cases for fflush, fseek, etc. should be more correct.
- library-internal macros are used for getc_unlocked and putc_unlocked
now, eliminating some ugly code duplication. __uflow and __overflow
are no longer used anywhere but these macros. switch to read or
write mode is also separated so the code can be better shared, e.g.
with ungetc.
- lots of other small things.
|
|
|