This beats the previous 3-LUT version and even beats SSE on my system.
Test code:
---
int main( int argc, char *argv[] )
{
SDL_Surface *orig = SDL_LoadPNG("testyuv.png");
SDL_Surface *surf16 = SDL_ConvertSurface(orig, SDL_PIXELFORMAT_RGB565);
SDL_Surface *surf32 = SDL_ConvertSurface(surf16, SDL_PIXELFORMAT_ARGB8888);
Uint64 then = SDL_GetTicks();
for (int i = 0; i < 100000; ++i) {
SDL_BlitSurface(surf16, NULL, surf32, NULL);
}
Uint64 now = SDL_GetTicks();
SDL_Log("Blit took %d ms\n", (int)(now - then));
return 0;
}
---
Results on my system:
BlitNtoN: Blit took 34522 ms
Blit_RGB565_32 (3 LUT): Blit took 9316 ms
Blit_RGB565_32 (1 LUT): Blit took 5268 ms
Blit_RGB565_32_SSE41: Blit took 6399 ms
Tray events on *nix platforms usually run over DBus, and the events subsequently aren't delivered via the window event queue. As a result, SDL_WaitEvent() won't unblock when tray events arrive, particularly if there is no currently active window.
Wake up periodically to poll when tray items are active to avoid blocking the delivery and processing of tray events.
While in a modal loop, the size in WM_WINDOWPOSCHANGING/WM_WINDOWPOSCHANGED may only be updated if the window is being resized interactively. Set the SWP_NOSIZE flag if the size hasn't changed from the last move/size event, or a size set programmatically may end up being overwritten by old size data.
When iterating over the keymap entries, a valid xkb state object has already been allocated, so use that instead of allocating/destroying a new state object for every lookup, which avoids a calloc/free operation inside libxkbcommon. Any state set by the level lookup function will be overwritten with valid state after keymap iteration has completed.
The spec doesn't guarantee that a modifier event won't arrive before a keymap event, or that it will always be sent after a keymap change if the modifiers and layout index haven't changed, so restore any valid state after allocation when building a new keymap.
cl.exe versions ≥v19.41 call this builtin for double → uint64_t
conversions on x86. SDL currently needs such conversions in:
* MainCallbackRateHintChanged()
* SDL_PrintFloat()
* WIN_ApplyWindowProgress()
This seems enough to justify implementing this function rather than
trying to work around it, as it was done in sdl12-compat:
https://github.com/libsdl-org/sdl12-compat/issues/352
This implementation was taken from ReactOS:
https://git.reactos.org/?p=reactos.git;a=commitdiff;h=f637e6b809adb5e0ae420ef4f80c73b19172a2e7
Passes the stdlib testautomation, and also matches the behavior of
Microsoft's 64-bit libc for the currently implementation-defined case
of calling SDL_PrintFloat() with values >SDL_MAX_UINT64.
This is probably something we already cleaned up that has something running
in an unexpected order now that we've moved disconnect work to the main thread.