SDL_strcasecmp (even when calling into a C runtime) does not work with
Unicode chars, and depending on the user's locale, might not work with
even basic ASCII strings.
This implements the function from scratch, using "case-folding,"
which is a more robust method that deals with various languages. It
involves a hashtable of a few hundred codepoints that are "uppercase" and
how to map them to lowercase equivalents (possibly increasing the size of
the string in the process). The vast majority of human languages (and
Unicode) do not have letters with different cases, but still, this static
table takes about 10 kilobytes on a 64-bit machine.
Even this will fail in one known case: the Turkish 'i' folds differently
if you're writing in Turkish vs other languages. Generally this is seen as
unfortunate collateral damage in cases where you can't specify the language
in use.
In addition to case-folding the codepoints, the new functions also know how
to decode the various formats to turn them into codepoints in the first
place, instead of blindly stepping by one byte (or one wchar_t) per
character.
Also included is casefolding.txt from the Unicode Consortium and a perl
script to generate the hashtable from that text file, so we can trivially
update this if new languages are added in the future.
A simple test using the new function:
```c
#include <SDL3/SDL.h>
int main(void)
{
const char *a = "α ε η";
const char *b = "Α Ε Η";
SDL_Log(" strcasecmp(\"%s\", \"%s\") == %d\n", a, b, strcasecmp(a, b));
SDL_Log("SDL_strcasecmp(\"%s\", \"%s\") == %d\n", a, b, SDL_strcasecmp(a, b));
return 0;
}
```
Produces:
```
INFO: strcasecmp("α ε η", "Α Ε Η") == 32
INFO: SDL_strcasecmp("α ε η", "Α Ε Η") == 0
```
glibc strcasecmp() fails to compare a Greek lowercase string to its uppercase
equivalent, even with a UTF-8 locale, but SDL_strcasecmp() works.
Other SDL_stdinc.h functions are changed to be more consistent, which is to
say they now ignore any C runtime and often dictate that only English-based
low-ASCII works with them.
Fixes Issue #9313.
Modern C runtimes have well optimized memset and memcpy, so use those instead of dispatching into SDL's versions. In addition, some compilers can analyze memset and memcpy calls and directly turn them into optimized assembly.
warning C28251: Inconsistent annotation for 'SDL_vsscanf_REAL': this instance has no annotations. See c:\projects\sdl-experimental\include\sdl3\sdl_stdinc.h(597).
warning C28252: Inconsistent annotation for 'SDL_vsnprintf_REAL': _Param_(3) has 'SAL_IsFormatString("printf")' on the prior instance. See c:\projects\sdl-experimental\include\sdl3\sdl_stdinc.h(600).
warning C28253: Inconsistent annotation for 'SDL_vswprintf_REAL': _Param_(3) has 'SAL_IsFormatString("printf")' on this instance. See c:\projects\sdl-experimental\include\sdl3\sdl_stdinc.h(601).
warning C28251: Inconsistent annotation for 'SDL_vasprintf_REAL': this instance has no annotations. See c:\projects\sdl-experimental\include\sdl3\sdl_stdinc.h(603).
I updated .clang-format and ran clang-format 14 over the src and test directories to standardize the code base.
In general I let clang-format have it's way, and added markup to prevent formatting of code that would break or be completely unreadable if formatted.
The script I ran for the src directory is added as build-scripts/clang-format-src.sh
This fixes:
#6592#6593#6594
* Add braces after if conditions
* More add braces after if conditions
* Add braces after while() conditions
* Fix compilation because of macro being modified
* Add braces to for loop
* Add braces after if/goto
* Move comments up
* Remove extra () in the 'return ...;' statements
* More remove extra () in the 'return ...;' statements
* More remove extra () in the 'return ...;' statements after merge
* Fix inconsistent patterns are xxx == NULL vs !xxx
* More "{}" for "if() break;" and "if() continue;"
* More "{}" after if() short statement
* More "{}" after "if () return;" statement
* More fix inconsistent patterns are xxx == NULL vs !xxx
* Revert some modificaion on SDL_RLEaccel.c
* SDL_RLEaccel: no short statement
* Cleanup 'if' where the bracket is in a new line
* Cleanup 'while' where the bracket is in a new line
* Cleanup 'for' where the bracket is in a new line
* Cleanup 'else' where the bracket is in a new line
strchr and strrchr return a pointer to the first/last occurrence of a
character in a string, or NULL if the character is not found. According
to the C standard, the final null terminator is part of the string, and
it should thus be possible to get a pointer to the final null with
these functions. The fallback implementations of SDL_strchr and
SDL_strrchr would always return NULL if trying to find '\0', and this
commit fixes that.
This is done such that we can disable LTO for these 2 functions when
building with MSVC.
This is due to a limitation of Link Time Code Generation (LTCG).
Code generation might generate a new reference to memset after linking
has started. The LTCG must make assumptions about where memset is
defined which is normally the C runtime.