summaryrefslogtreecommitdiff
path: root/src/stdio/ungetwc.c
AgeCommit message (Collapse)AuthorFilesLines
2016-04-26fix FILE buffer underflow in ungetwcRich Felker1-3/+3
commit 7e816a6487932cbb3cb71d94b609e50e81f4e5bf (version 1.1.11 release cycle) moved the code that performs wchar_t to multibyte conversion across code that used the resulting length in bytes, thereby breaking the unget buffer space check in ungetwc and clobbering up to three bytes below the start of the buffer. for allocated FILEs (all read-enabled FILEs except stdin), the underflow clobbers at most the FILE-specific locale pointer. no stores are performed through this pointer, but subsequent loads may result in a crash or mismatching encoding rule (UTF-8 multibyte vs byte-based). for stdin, the buffer lies in .bss and the underflow may clobber another object. in practice, for libc.so the adjacent object seems to be stderr's buffer, which is completely unused, but this could vary with linking options, or when static linking. applications which do not attempt to use more than one character of ungetwc pushback, or which do not use ungetwc, are not affected.
2015-06-16byte-based C locale, phase 2: stdio and iconv (multibyte callers)Rich Felker1-0/+5
this patch adjusts libc components which use the multibyte functions internally, and which depend on them operating in a particular encoding, to make the appropriate locale changes before calling them and restore the calling thread's locale afterwards. activating the byte-based C locale without these changes would cause regressions in stdio and iconv. in the case of iconv, the current implementation was simply using the multibyte functions as UTF-8 conversions. setting a multibyte UTF-8 locale for the duration of the iconv operation allows the code to continue working. in the case of stdio, POSIX requires that FILE streams have an encoding rule bound at the time of setting wide orientation. as long as all locales, including the C locale, used the same encoding, treating high bytes as UTF-8, there was no need to store an encoding rule as part of the stream's state. a new locale field in the FILE structure points to the locale that should be made active during fgetwc/fputwc/ungetwc on the stream. it cannot point to the locale active at the time the stream becomes oriented, because this locale could be mutable (the global locale) or could be destroyed (locale_t objects produced by newlocale) before the stream is closed. instead, a pointer to the static C or C.UTF-8 locale object added in commit commit aeeac9ca5490d7d90fe061ab72da446c01ddf746 is used. this is valid since categories other than LC_CTYPE will not affect these functions.
2015-06-13fix idiom for setting stdio stream orientation to wideRich Felker1-1/+1
the old idiom, f->mode |= f->mode+1, was adapted from the idiom for setting byte orientation, f->mode |= f->mode-1, but the adaptation was incorrect. unless the stream was alreasdy set byte-oriented, this code incremented f->mode each time it was executed, which would eventually lead to overflow. it could be fixed by changing it to f->mode |= 1, but upcoming changes will require slightly more work at the time of wide orientation, so it makes sense to just call fwide. as an optimization in the single-character functions, fwide is only called if the stream is not already wide-oriented.
2015-06-06remove another invalid skip of locking in ungetwcRich Felker1-3/+1
2015-06-06remove invalid skip of locking in ungetwcRich Felker1-6/+3
aside from being invalid, the early check only optimized the error case, and likely pessimized the common case by separating the two branches on isascii(c) at opposite ends of the function.
2015-05-29fix failure of ungetc and ungetwc to work on files in eof statusRich Felker1-1/+2
these functions were written to handle clearing eof status, but failed to account for the __toread function's handling of eof. with this patch applied, __toread still returns EOF when the file is in eof status, so that read operations will fail, but it also sets up valid buffer pointers for read mode, which are set to the end of the buffer rather than the beginning in order to make the whole buffer available to ungetc/ungetwc. minor changes to __uflow were needed since it's now possible to have non-zero buffer pointers while in eof status. as made, these changes remove a 'fast path' bypassing the function call to __toread, which could be reintroduced with slightly different logic, but since ordinary files have a syscall in f->read, optimizing the code path does not seem worthwhile. the __stdio_read function is also updated not to zero the read buffer pointers on eof/error. while not necessary for correctness, this change avoids the overhead of calling __toread in ungetc after reaching eof, and it also reduces code size and increases consistency with the fmemopen read operation which does not zero the pointers.
2012-11-08clean up stdio_impl.hRich Felker1-0/+4
this header evolved to facilitate the extremely lazy practice of omitting explicit includes of the necessary headers in individual stdio source files; not only was this sloppy, but it also increased build time. now, stdio_impl.h is only including the headers it needs for its own use; any further headers needed by source files are included directly where needed.
2011-03-28major stdio overhaul, using readv/writev, plus other changesRich Felker1-16/+1
the biggest change in this commit is that stdio now uses readv to fill the caller's buffer and the FILE buffer with a single syscall, and likewise writev to flush the FILE buffer and write out the caller's buffer in a single syscall. making this change required fundamental architectural changes to stdio, so i also made a number of other improvements in the process: - the implementation no longer assumes that further io will fail following errors, and no longer blocks io when the error flag is set (though the latter could easily be changed back if desired) - unbuffered mode is no longer implemented as a one-byte buffer. as a consequence, scanf unreading has to use ungetc, to the unget buffer has been enlarged to hold at least 2 wide characters. - the FILE structure has been rearranged to maintain the locations of the fields that might be used in glibc getc/putc type macros, while shrinking the structure to save some space. - error cases for fflush, fseek, etc. should be more correct. - library-internal macros are used for getc_unlocked and putc_unlocked now, eliminating some ugly code duplication. __uflow and __overflow are no longer used anywhere but these macros. switch to read or write mode is also separated so the code can be better shared, e.g. with ungetc. - lots of other small things.
2011-03-25fix all implicit conversion between signed/unsigned pointersRich Felker1-1/+1
sadly the C language does not specify any such implicit conversion, so this is not a matter of just fixing warnings (as gcc treats it) but actual errors. i would like to revisit a number of these changes and possibly revise the types used to reduce the number of casts required.
2011-02-12initial check-in, version 0.5.0v0.5.0Rich Felker1-0/+45