view TODO @ 1542:a8bf1aa21020

Fixed bug #15

SDL_blit_A.mmx-speed.patch.txt --
Speed improvements and a bugfix for the current GCC inline MMX asm code:
 - Changed some ops and removed the ones that became useless as a result.
 - Added some instruction parallelism (some gain). The resulting speed on my Xeon improved by up to 35% depending on the function (measured in fps).
 - Fixed a bug where BlitRGBtoRGBSurfaceAlphaMMX() was setting the alpha component on the destination surface (to opaque alpha) even when the surface had none.

SDL_blit_A.mmx-msvc.patch.txt --
MSVC MMX intrinsics version of the same GCC asm code. The MSVC compiler tries to parallelize the code and to avoid register stalls, but does not always do a very good job. The per-surface blending MSVC functions run quite a bit faster than their pure-asm counterparts (up to 55% faster for the 16-bit ones), but the per-pixel blending runs somewhat slower than asm.
 - BlitRGBtoRGBSurfaceAlphaMMX and BlitRGBtoRGBPixelAlphaMMX (and all variants) can now also handle formats other than (A)RGB8888. Formats like RGBA8888 and some quite exotic ones are allowed, like RAGB8888, or actually anything with channels aligned on 8-bit boundaries and a full 8-bit alpha (for per-pixel alpha blending). The performance cost of this change is virtually zero for per-surface alpha blending (no extra ops inside the loop) and a single non-MMX op inside the loop for per-pixel blending. In testing, per-pixel alpha blending takes a ~2% performance hit, but it still runs much faster than the current code in CVS. If necessary, a separate function with this functionality can be made.

This code requires the Processor Pack for VC6.
author Sam Lantinga <slouken@libsdl.org>
date Wed, 15 Mar 2006 15:39:29 +0000
parents f02e673ffc5f
children f12379c41042


Wish list for the 1.3 development branch:
http://bugzilla.libsdl.org/

 * Add mousewheel events (new unified event architecture?)
 * DirectInput joystick support needs to be implemented
 * Be able to enumerate and select available audio and video drivers
 * Fullscreen video mode support for MacOS X
 * Explicit vertical retrace wait (maybe separate from SDL_Flip?)
 * Shaped windows, windows without borders
 * Multiple windows, multiple display support
 * SDL_INIT_EVENTTHREAD on Windows and MacOS?
 * Add a timestamp to events
 * Add audio input API
 * Add hardware accelerated scaled blit
 * Add hardware accelerated alpha blits
 * Redesign blitting architecture to allow blit plugins

In the jump from 1.2 to 1.3, we should change the SDL_Rect members to
int and evaluate all the rest of the datatypes.  This is the only place
we should do it though, since the 1.2 series should not break binary
compatibility in this way.
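
As a rough illustration of why widening the SDL_Rect members breaks binary compatibility, the sketch below compares two stand-in structs. Rect12 mirrors the member types of the 1.2 SDL_Rect (Sint16 x, y; Uint16 w, h). Rect13 is only an assumed shape for a 1.3 rectangle, and both helper functions are hypothetical names invented for this example, not part of SDL.

```c
#include <stddef.h>
#include <stdint.h>

/* Same member types as the SDL 1.2 SDL_Rect: 16-bit position and size. */
typedef struct {
    int16_t  x, y;
    uint16_t w, h;
} Rect12;

/* Assumed 1.3 layout (hypothetical): every member widened to int. */
typedef struct {
    int x, y;
    int w, h;
} Rect13;

/* Returns 1 if the two layouts differ in size or in any member offset,
 * which is what would make a binary built against Rect12 misread a Rect13. */
int rect_layout_differs(void)
{
    return sizeof(Rect12) != sizeof(Rect13)
        || offsetof(Rect12, y) != offsetof(Rect13, y)
        || offsetof(Rect12, w) != offsetof(Rect13, w)
        || offsetof(Rect12, h) != offsetof(Rect13, h);
}

/* Returns 1 if a coordinate can be stored in the 16-bit x member
 * without truncation, 0 otherwise. */
int fits_in_rect12_x(long value)
{
    return value >= INT16_MIN && value <= INT16_MAX;
}
```

On a platform where int is 32 bits wide, the struct size and the member offsets both change, so an application compiled against the 1.2 headers could not safely pass rectangles to a library built with the wider layout. The second helper shows the motivation for the change: a coordinate such as 40000 does not fit in the 16-bit member.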

Requests:
 * PCM and CDROM volume control (deprecated, but possible)