path: root/src/locale
Age | Commit message | Author | Files | Lines
2014-07-02 | properly pass current locale to *_l functions when used internally | Rich Felker | 6 | -6/+12
this change is presently non-functional since the callees do not yet use their locale argument for anything.
2014-07-02 | consolidate str[n]casecmp_l into str[n]casecmp source files | Rich Felker | 2 | -13/+0
this is mainly done for consistency with the ctype functions and to declutter the src/locale directory.
2014-07-02 | consolidate *_l ctype/wctype functions into their non-_l source files | Rich Felker | 32 | -204/+0
the main practical purposes of this commit are to remove a huge amount of clutter from the src/locale directory, to cut down on the length of the $(AR) and $(LD) command lines, and to reduce the amount of space wasted by object file headers in the static libc.a. build time may also be reduced, though this has not been measured. as an additional justification, if there ever were a need for the behavior of these functions to vary by locale, it would be necessary for the non-_l versions to call the _l versions, so that linking the former without the latter would not be possible anyway.
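a minimal sketch of the layout this message describes: one source file holding both variants, with the plain function forwarding to the _l one (the direction the message says would be needed if behavior ever varied by locale). all names here (struct sketch_locale, sketch_isdigit, sketch_isdigit_l) are hypothetical and are not musl's actual code.

```c
#include <stddef.h>

/* hypothetical stand-in for the libc-internal locale object */
struct sketch_locale { int ctype_utf8; };

/* locale-taking variant; the locale argument is accepted but ignored,
 * matching the "dummied out" state described in this log */
static int sketch_isdigit_l(int c, const struct sketch_locale *loc)
{
	(void)loc;
	return (unsigned)c - '0' < 10;
}

/* plain variant in the same file, forwarding to the _l variant;
 * NULL is used here only as a placeholder for "no specific locale" */
static int sketch_isdigit(int c)
{
	return sketch_isdigit_l(c, NULL);
}
```

with both definitions in one translation unit, linking one without the other is not a concern.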
2014-07-02 | add locale framework | Rich Felker | 5 | -19/+155
this commit adds non-stub implementations of setlocale, duplocale, newlocale, and uselocale, along with the data structures and minimal code needed for representing the active locale on a per-thread basis and optimizing the common case where thread-local locale settings are not in use.

at this point, the data structures only contain what is necessary to represent LC_CTYPE (a single flag) and LC_MESSAGES (a name for use in finding message translation files). representation for the other categories will be added later; the expectation is that a single pointer will suffice for each.

for LC_CTYPE, the strings "C" and "POSIX" are treated as special; any other string is accepted and treated as "C.UTF-8". for other categories, any string is accepted after being truncated to a maximum supported length (currently 15 bytes). for LC_MESSAGES, the name is kept regardless of whether libc itself can use such a message translation locale, since applications using catgets or gettext should be able to use message locales libc is not aware of. for other categories, names which are not successfully loaded as locales (which, at present, means all names) are treated as aliases for "C".

setlocale never fails. locale settings are not yet used anywhere, so this commit should have no visible effects except for the contents of the string returned by setlocale.
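the name handling described above can be sketched roughly as follows. this is an illustration under assumptions, not the committed code: sketch_ctype_is_utf8, sketch_store_name, and SKETCH_NAME_MAX are hypothetical names, and only the two rules stated in the message (the special "C"/"POSIX" strings and the 15-byte truncation) are modeled.

```c
#include <string.h>

#define SKETCH_NAME_MAX 15  /* maximum supported name length per the message */

/* returns 0 for the special names "C" and "POSIX"; any other name is
 * accepted and treated like "C.UTF-8", so 1 is returned */
static int sketch_ctype_is_utf8(const char *name)
{
	return strcmp(name, "C") != 0 && strcmp(name, "POSIX") != 0;
}

/* stores a category name, truncated to SKETCH_NAME_MAX bytes */
static void sketch_store_name(char buf[SKETCH_NAME_MAX + 1], const char *name)
{
	size_t n = strlen(name);
	if (n > SKETCH_NAME_MAX) n = SKETCH_NAME_MAX;
	memcpy(buf, name, n);
	buf[n] = 0;
}
```

neither helper has a failure path, which is consistent with the statement that setlocale never fails.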
2014-06-10 | replace all remaining internal uses of pthread_self with __pthread_self | Rich Felker | 1 | -1/+1
prior to version 1.1.0, the difference between pthread_self (the public function) and __pthread_self (the internal macro or inline function) was that the former would lazily initialize the thread pointer if it was not already initialized, whereas the latter would crash in this case. since lazy initialization is no longer supported, use of pthread_self no longer makes sense; it simply generates larger, slower code.
2014-05-13 | add cp437 and cp850 to available iconv conversions | Rich Felker | 2 | -177/+206
perhaps some additional legacy DOS-era codepages would also be useful to have, but these are the ones for which there has been demand. the size of the diff is due to the fact that legacychars.h is updated in such a way that new characters are inserted into the table in unicode codepoint order; thus other mappings in codepages.h have changed to reflect the new table indices of their characters.
2014-01-23 | fix an overflow in wcsxfrm when n==0 | Szabolcs Nagy | 1 | -2/+4
posix allows zero length destination
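a sketch of how a zero-length destination can be handled safely, assuming an identity transform as in the "C" locale. sketch_wcsxfrm is a hypothetical name; the point is only that nothing is written, not even a terminator, when n is 0.

```c
#include <wchar.h>

/* identity-transform sketch: copies at most n wide characters
 * (including the terminator) and returns the full source length */
static size_t sketch_wcsxfrm(wchar_t *dest, const wchar_t *src, size_t n)
{
	size_t l = wcslen(src);
	if (l < n) {
		wmemcpy(dest, src, l + 1);   /* fits, terminator included */
	} else if (n) {
		wmemcpy(dest, src, n - 1);   /* truncate, leaving room for the terminator */
		dest[n - 1] = 0;
	}
	/* n == 0: dest is never touched, so it may even be a null pointer */
	return l;
}
```

a caller can therefore pass n==0 to ask for the required length before allocating.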
2013-12-12 | include cleanups: remove unused headers and add feature test macros | Szabolcs Nagy | 3 | -3/+1
2013-11-25 | remove duplicate includes from dynlink.c, strfmon.c and getaddrinfo.c | Szabolcs Nagy | 1 | -1/+0
2013-08-17 | remove spurious tmp file present since initial git check-in | Rich Felker | 1 | -390/+0
2013-08-17 | add hkscs/big5-2003/eten extensions to iconv big5 | Rich Felker | 3 | -977/+1433
with these changes, the character set implemented as "big5" in musl is a pure superset of cp950, the canonical "big5", and agrees with the normative parts of Unicode. this means it has minor differences from both hkscs and big5-2003:

- the range A2CC-A2CE maps to CJK ideographs rather than numerals, contrary to changes made in big5-2003.
- C6CD maps to a CJK ideograph rather than its corresponding Kangxi radical character, contrary to changes made in hkscs.
- F9FE maps to U+2593 rather than U+FFED.

of these differences, none but the last are visually distinct, and the last is a character used purely for text-based graphics, not to convey linguistic content.

should there be future demand for strict conformance to big5-2003 or hkscs mappings, the present charset aliases can be replaced with distinct variants.

reportedly there are other non-standard big5 extensions in common use in Taiwan and perhaps elsewhere, which could also be added as layers on top of the existing big5 support.

there may be additional characters which should be added to the hkscs table: the whatwg standard for big5 defines what appears to be a superset of hkscs.
2013-08-07 | add Big5 charset support to iconv | Rich Felker | 2 | -0/+1066
at this point, it is just the common base charset equivalent to Windows CP 950, with no further extensions. HKSCS and possibly other supersets will be added later. other aliases may need to be added too.
2013-08-05 | iconv support for legacy Korean encodings | Rich Felker | 2 | -0/+678
like for other character sets, stateful iso-2022 form is not supported yet but everything else should work.

all charset aliases are treated the same, as Windows codepage 949, because reportedly the EUC-KR charset name is in widespread (mis?)usage in email and on the web for data which actually uses the extended characters outside the standard 93x94 grid. this could easily be changed if desired.

the principle of this converter for handling the giant bulk of rare Hangul syllables outside of the standard KS X 1001 93x94 grid is the same as the GB18030 converter's treatment of non-explicitly-coded Unicode codepoints: sequences in the extension range are mapped to an integer index N, and the converter explicitly computes the Nth Hangul syllable not explicitly encoded in the character map. empirically, this requires at most 7 passes over the grid.

this approach reduces the table size required for Korean legacy encodings from roughly 44k to 17k and should have minimal performance impact on real-world text conversions since the "slow" characters are rare. where it does have impact, the cost is merely a large constant time factor.
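the "Nth value not explicitly encoded" computation can be sketched with repeated passes over an unsorted table, as below. this is an illustration of the idea only, on a toy table; nth_uncoded and its parameters are hypothetical names, and the table is assumed to contain no duplicates.

```c
#include <stddef.h>

/* returns the n-th (0-based) value >= base that does not occur in the
 * unsorted, duplicate-free array coded[0..len-1] */
static unsigned nth_uncoded(unsigned base, unsigned n,
                            const unsigned *coded, size_t len)
{
	unsigned lo = base;      /* start of the range not yet scanned */
	unsigned hi = base + n;  /* current candidate for the answer */
	while (lo <= hi) {
		unsigned k = 0;
		/* one pass over the table: count coded values in [lo, hi] */
		for (size_t i = 0; i < len; i++)
			if (coded[i] >= lo && coded[i] <= hi) k++;
		lo = hi + 1;
		hi += k;             /* each coded value pushes the candidate up by one */
	}
	return hi;
}
```

each pass scans the whole table, so the cost is the number of passes times the table size, in line with the "large constant time factor" noted above.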
2013-07-28 | fix semantically incorrect use of LC_GLOBAL_LOCALE | Rich Felker | 5 | -5/+5
LC_GLOBAL_LOCALE refers to the global locale, controlled by setlocale, not the thread-local locale in effect which these functions should be using. neither LC_GLOBAL_LOCALE nor 0 as an argument to the *_l functions has behavior defined by the standard, but 0 is a more logical choice for requesting the callee to look up the current locale. in the future I may move the current locale lookup to the caller (the non-_l-suffixed wrapper). at this point, all of the locale logic is dummied out, so no harm was done, but it should at least avoid misleading usage.
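one way to picture the "0 means look up the current locale" convention is the helper below. it is a sketch only: sketch_resolve_locale is a hypothetical name, and the sole library call used is the POSIX query form uselocale((locale_t)0), which returns the calling thread's current locale (or LC_GLOBAL_LOCALE if no thread-local locale is set).

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>

/* hypothetical callee-side helper: a zero argument requests the
 * locale currently in effect for the calling thread */
static locale_t sketch_resolve_locale(locale_t loc)
{
	if (loc != (locale_t)0)
		return loc;               /* caller supplied an explicit locale */
	return uselocale((locale_t)0);    /* query only; does not change the locale */
}
```

passing LC_GLOBAL_LOCALE through such a helper would return the global locale handle unchanged, which is exactly why it is the wrong value for "whatever is current".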
2013-07-24 | rework langinfo code for ABI compat and for use by time code | Rich Felker | 2 | -9/+8
2013-07-24 | update strxfrm/wcsxfrm for future LC_COLLATE support and ABI compat | Rich Felker | 4 | -14/+20
2013-07-24 | add ABI compat aliases for a number of locale_t functions | Rich Felker | 8 | -0/+24
2013-07-24 | prepare strcoll/wcscoll for LC_COLLATE support and add ABI symbols | Rich Felker | 4 | -15/+20
2013-07-24 | move strftime_l into strftime.c and add __-prefixed version | Rich Felker | 1 | -7/+0
the latter is both for ABI purposes, and to facilitate eventually adding LC_TIME support. it's also nice to eliminate an extra source file.
2013-06-26 | fix iconv conversion to legacy 8bit codepages | Rich Felker | 1 | -2/+2
this seems to have been a simple copy-and-paste error from the code for converting from legacy codepages.
2012-09-06 | use restrict everywhere it's required by c99 and/or posix 2008 | Rich Felker | 7 | -8/+8
to deal with the fact that the public headers may be used with pre-c99 compilers, __restrict is used in place of restrict, and defined appropriately for any supported compiler. we also avoid the form [restrict] since older versions of gcc rejected it due to a bug in the original c99 standard, and instead use the form *restrict.
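to illustrate the two spellings mentioned above, here is a small sketch using the pointer form. sketch_copy is a hypothetical function, and plain restrict is used because this is ordinary c99 source rather than a public header.

```c
#include <stddef.h>

/* pointer form (*restrict): the spelling the message says is used */
static void sketch_copy(char *restrict dest, const char *restrict src, size_t n)
{
	for (size_t i = 0; i < n; i++)
		dest[i] = src[i];
}

/* the array form would declare the same parameter as
 *     char dest[restrict]
 * which the message says older gcc versions rejected */
```

in a public header the qualifier would be spelled __restrict, as the message explains, so that pre-c99 compilers can still parse the declaration.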
2012-06-20 | duplocale: don't crash when called with LC_GLOBAL_LOCALE | Rich Felker | 1 | -1/+1
posix has resolved to add this usage; for now, we just avoid writing anything to the new locale object since it's not used anyway.
2012-06-19 | fix localeconv values and implementation | Rich Felker | 1 | -15/+28
dynamic-allocation of the structure is not valid; it can crash an application if malloc fails. since localeconv is not specified to have failure conditions, the object needs to have static storage duration. need to review whether all the values are right or not still..
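a sketch of the static-storage approach, under assumptions: struct sketch_lconv is a hypothetical cut-down stand-in for struct lconv with only three members, filled with the values the C standard gives for the "C" locale.

```c
#include <limits.h>

/* hypothetical reduced version of struct lconv */
struct sketch_lconv {
	const char *decimal_point;
	const char *thousands_sep;
	char frac_digits;
};

/* returns a pointer to an object with static storage duration, so
 * there is no allocation and therefore no failure path */
static const struct sketch_lconv *sketch_localeconv(void)
{
	static const struct sketch_lconv posix_values = { ".", "", CHAR_MAX };
	return &posix_values;
}
```

every call returns the same address, and the caller never has to check for a null pointer.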
2012-06-18 | fix multiple iconv bugs reading utf-16/32 and wchar_t | Rich Felker | 1 | -8/+8
2012-06-18 | fix iconv dest utf-16: unavailable chars must be replaced; EILSEQ is wrong | Rich Felker | 1 | -2/+2
2012-06-18 | fix erroneous utf-16 encoding with surrogates in iconv | Rich Felker | 1 | -0/+1
apparently this was never tested before.
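for reference, the standard utf-16 surrogate computation looks like the sketch below. the log does not say which step was wrong in the original code, so this shows only the well-known encoding rule; sketch_utf16_encode is a hypothetical name, and the input is assumed to be a valid scalar value (at most 0x10FFFF and not itself a surrogate).

```c
#include <stdint.h>

/* writes 1 or 2 utf-16 code units for c and returns how many were written */
static int sketch_utf16_encode(uint32_t c, uint16_t out[2])
{
	if (c < 0x10000) {
		out[0] = (uint16_t)c;                   /* BMP: single unit */
		return 1;
	}
	c -= 0x10000;                                   /* 20 bits remain */
	out[0] = (uint16_t)(0xD800 | (c >> 10));        /* high surrogate: top 10 bits */
	out[1] = (uint16_t)(0xDC00 | (c & 0x3FF));      /* low surrogate: bottom 10 bits */
	return 2;
}
```

for example, U+1F600 encodes as the pair D83D DE00.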
2012-04-21 | fix major breakage in iconv, bogus rejecting of dest charsets | Rich Felker | 1 | -1/+1
2012-03-25 | add strfmon_l variant (still mostly incomplete) | Rich Felker | 1 | -3/+27
2012-03-21 | initial, very primitive strfmon | Rich Felker | 1 | -0/+77
2012-03-01 | add all missing wchar functions except floating point parsers | Rich Felker | 2 | -0/+12
these are mostly untested and adapted directly from corresponding byte string functions and similar.
2012-02-06 | more locale_t interfaces (string stuff) and header updates | Rich Felker | 7 | -0/+48
this should be everything except for some functions where the non-_l version isn't even implemented yet (mainly some non-ISO-C wcs* functions).
2012-02-06 | fix some omissions and mistakes in locale_t interface definitions | Rich Felker | 13 | -13/+13
2012-02-06 | add more of the locale_t interfaces, all dummied out to ignore the locale | Rich Felker | 18 | -0/+108
2011-07-12 | gb18030 support in iconv (only from, not to) | Rich Felker | 2 | -2/+1887
also support (and restrict to subsets) older chinese sets, and explicitly refuse to convert to cjk (since there's no code for it yet)
2011-07-12 | legacy japanese charset support in iconv (only from, not to) | Rich Felker | 2 | -0/+597
2011-07-12 | simplify iconv and support more legacy codepages | Rich Felker | 3 | -352/+331
2011-07-03 | iconv was not returning -1 on most failure | Rich Felker | 1 | -0/+2
this broke most uses of iconv in real-world programs, especially glib's iconv wrappers.
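callers typically detect failure by comparing the result of iconv with (size_t)-1, which is why a missing -1 breaks them. a minimal sketch of such a caller follows; try_convert is a hypothetical helper, and the charset names a program may pass depend on what the system's iconv supports.

```c
#include <iconv.h>
#include <stddef.h>

/* returns 0 if the whole input converted, -1 if the descriptor could
 * not be opened or iconv reported an error */
static int try_convert(const char *to, const char *from,
                       char *in, size_t inlen, char *out, size_t outlen)
{
	iconv_t cd = iconv_open(to, from);
	if (cd == (iconv_t)-1)
		return -1;
	size_t r = iconv(cd, &in, &inlen, &out, &outlen);
	iconv_close(cd);
	return r == (size_t)-1 ? -1 : 0;   /* the check that relies on -1 */
}
```

if iconv returned some other value on failure, a wrapper like this would report success for input it never converted.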
2011-05-30 | implement uselocale function (minimal) | Rich Felker | 1 | -0/+10
2011-04-07 | fix breakage due to converting a return type to size_t in iconv... | Rich Felker | 1 | -1/+1
2011-04-03 | fix nl_langinfo to actually use the existing, correct internal version | Rich Felker | 2 | -15/+5
2011-03-25 | fix all implicit conversion between signed/unsigned pointers | Rich Felker | 1 | -11/+11
sadly the C language does not specify any such implicit conversion, so this is not a matter of just fixing warnings (as gcc treats it) but actual errors. i would like to revisit a number of these changes and possibly revise the types used to reduce the number of casts required.
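a small sketch of the kind of explicit cast this refers to, assuming a function that receives char data but needs to read it as unsigned bytes. sketch_byte_sum is a hypothetical example, not code from the commit.

```c
#include <stddef.h>

/* sums the bytes of s as values in the range 0..255 */
static unsigned sketch_byte_sum(const char *s, size_t n)
{
	/* explicit cast: C defines no implicit conversion from
	 * const char * to const unsigned char * */
	const unsigned char *p = (const unsigned char *)s;
	unsigned sum = 0;
	for (size_t i = 0; i < n; i++)
		sum += p[i];
	return sum;
}
```

without the cast the initialization is a constraint violation, even though gcc reports it only as a warning by default.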
2011-02-13 | use a more-correct integer type, and silence 64-bit warnings as a bonus | Rich Felker | 1 | -2/+2
2011-02-12 | initial check-in, version 0.5.0 (tag: v0.5.0) | Rich Felker | 31 | -0/+1292