path: root/src/multibyte/internal.h
Age | Commit message | Author | Files | Lines
2016-06-21 | remove comments on copyright status from UTF-8 implementation files | Rich Felker | 1 | -6/+0
despite clarifications made to the COPYRIGHT file in commit f0a61399330bae42beeb27d6ecd05570b3382a60, there continues to be confusion about whether the permissions granted actually apply to all files. I am the sole author of these files and clearly intend, and have always intended, for the grant of permission to apply to them.
2015-07-25 | fix undefined left-shift of negative values in utf-8 state table | Rich Felker | 1 | -1/+1
2015-06-16 | byte-based C locale, phase 1: multibyte character handling functions | Rich Felker | 1 | -0/+7
this patch makes the functions which work directly on multibyte characters treat the high bytes as individual abstract code units rather than as multibyte sequences when MB_CUR_MAX is 1. since MB_CUR_MAX is presently defined as a constant 4, all of the new code added is dead code, and optimizing compilers' code generation should not be affected at all. a future commit will activate the new code. as abstract code units, bytes 0x80 to 0xff are represented by wchar_t values 0xdf80 to 0xdfff, at the end of the surrogates range. this ensures that they will never be misinterpreted as Unicode characters, and that all wctype functions return false for these "characters" without needing locale-specific logic. a high range outside of Unicode such as 0x7fffff80 to 0x7fffffff was also considered, but since C11's char16_t also needs to be able to represent conversions of these bytes, the surrogate range was the natural choice.
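The mapping described above, from bytes 0x80 to 0xff onto wchar_t values 0xdf80 to 0xdfff, can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the commit: the helper names `byte_to_codeunit`, `is_codeunit` and `codeunit_to_byte` are hypothetical.

```c
#include <wchar.h>

/* Hypothetical helpers illustrating the mapping described above:
 * bytes 0x80..0xff <-> wchar_t values 0xdf80..0xdfff. */

/* Map a high byte (0x80..0xff) to its abstract code unit.
 * Only meaningful for b >= 0x80. */
static wchar_t byte_to_codeunit(unsigned char b)
{
    return (wchar_t)(0xdf00 | b);
}

/* Nonzero if wc lies in the reserved range 0xdf80..0xdfff.
 * The unsigned subtraction wraps for values below 0xdf80,
 * so a single comparison covers both bounds. */
static int is_codeunit(wchar_t wc)
{
    return (unsigned long)wc - 0xdf80UL < 0x80UL;
}

/* Recover the original byte from a code unit. */
static unsigned char codeunit_to_byte(wchar_t wc)
{
    return (unsigned char)(wc & 0xff);
}
```

Because the whole range sits inside the surrogate block, a value produced by `byte_to_codeunit` can never collide with a valid Unicode scalar value, which is the property the commit message relies on.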
2015-04-22 | remove libc.h dependency from otherwise-independent multibyte code | Rich Felker | 1 | -2/+4
2013-12-12 | include cleanups: remove unused headers and add feature test macros | Szabolcs Nagy | 1 | -0/+1
2013-04-08 | fix out-of-bounds access in UTF-8 decoding | Rich Felker | 1 | -1/+1
SA and SB are used as the lowest and highest valid starter bytes, but the value of SB was one-past the last valid starter. this caused access past the end of the state table when the illegal byte '\xf5' was encountered in a starter position. the error did not show up in full-character decoding tests, since the bogus state read from just past the table was unlikely to admit any continuation bytes as valid, but would have shown up had we tested feeding '\xf5' to the byte-at-a-time decoding in mbrtowc: it would cause the function to wrongly return -2 rather than -1. I may eventually go back and remove all references to SA and SB, replacing them with the values; this would make the code more transparent, I think. the original motivation for using macros was to allow misguided users of the code to redefine them for the purpose of enlarging the set of accepted sequences past the end of Unicode...
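The off-by-one described above can be illustrated with a small range check. This is a sketch under stated assumptions, not the decoder itself: the values 0xc2 and 0xf4 are the lowest and highest valid UTF-8 starter bytes per the UTF-8 definition, the pre-fix value 0xf5 is inferred from the commit message, and `SB_OLD`, `TABLE_LEN` and `starter_index` are hypothetical names.

```c
/* Lowest and highest valid UTF-8 starter bytes for multibyte sequences.
 * Assumed from the UTF-8 definition; the commit message does not
 * spell the values out. */
#define SA 0xc2u
#define SB 0xf4u

/* Assumed pre-fix value: one past the last valid starter. */
#define SB_OLD 0xf5u

/* A state table indexed by (c - SA) has one entry per starter byte. */
#define TABLE_LEN (SB - SA + 1)

/* Return the table index for starter byte c, or -1 if c is rejected.
 * hi is the upper bound used in the range check. */
static int starter_index(unsigned c, unsigned hi)
{
    /* Unsigned wraparound makes this reject c < SA as well. */
    if (c - SA > hi - SA) return -1;
    return (int)(c - SA);
}
```

With the assumed old bound, `starter_index(0xf5, SB_OLD)` yields an index equal to `TABLE_LEN`, which is one past the last entry; with the corrected bound the same byte is rejected before any table access.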
2012-02-23 | cleanup and work around visibility bug in gcc 3 that affects x86_64 | Rich Felker | 1 | -5/+3
in gcc 3, the visibility attribute must be placed on both the declaration and on the definition. if it's omitted from the definition, the compiler fails to emit the ".hidden" directive in the assembly, and the linker will either generate textrels (if supported, such as on i386) or refuse to link (on targets where certain types of textrels are forbidden or impossible without further assumptions about memory layout, such as on x86_64). this patch also unifies the decision about when to use visibility into libc.h and makes the visibility in the utf-8 state machine tables based on libc.h rather than a duplicate test.
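The workaround described above, repeating the attribute on both the declaration and the definition, might look roughly like this. It is a sketch only: the macro name `HIDDEN` and the table `example_table` are hypothetical and do not reproduce the actual header.

```c
#include <stdint.h>

/* Hypothetical macro and table name, for illustration only. */
#if defined(__GNUC__)
#define HIDDEN __attribute__((__visibility__("hidden")))
#else
#define HIDDEN
#endif

/* Declaration, as it would appear in a shared internal header. */
extern HIDDEN const uint32_t example_table[4];

/* Definition: the attribute is repeated here because, per the commit
 * message, gcc 3 otherwise fails to emit the ".hidden" directive. */
HIDDEN const uint32_t example_table[4] = { 0, 1, 2, 3 };
```

Defining the macro in one place keeps the decision about whether visibility is supported out of the individual source files, which matches the unification the commit message describes.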
2011-02-27 | cleanup utf-8 multibyte code, use visibility if possible | Rich Felker | 1 | -41/+4
this code was written independently of musl, with support for the backwards, nonstandard "31-bit unicode" some libraries/apps might want. unfortunately the extra code (inside #ifdef) makes the source harder to read and makes code that should be simple look complex, so i'm removing it. anyone who wants to use the old code can find it in the history or from elsewhere. also, change the visibility of the __fsmu8 state machine table to hidden, if supported. this should improve performance slightly in shared-library builds.
2011-02-13 | cleanup multibyte stuff to remove ugly casts, sanitize the ptr align casts | Rich Felker | 1 | -4/+4
2011-02-12 | initial check-in, version 0.5.0 (tag: v0.5.0) | Rich Felker | 1 | -0/+61