annotate src/video/SDL_yuv_mmx.c @ 4216:5b99971a27b4 SDL-1.2

Fixed bug #698

Hans de Goede 2009-02-13 01:10:52 PST

Since the new "glitch free" version of pulseaudio (used in Fedora 10 amongst others), the sound of SDL-using apps (like a simple playmus call) has been crackling. While looking into fixing this I noticed that the current pulseaudio code in SDL uses pa_simple.

However, pa_simple uses a thread to pump pulseaudio events and IPC. Given that SDL already has its own thread for audio handling, this is clearly suboptimal, leading to unnecessary context switching, IPC, etc. Also, pa_simple does not allow one to implement the WaitAudio() callback for SDL audio drivers properly.

Given that my work is mostly a rewrite (although some original pieces remain), I'm attaching the new .c and .h file, as that is easier to review than the huge diff. Let me know if you also want the diff.

This new version has the following features:
- no longer uses an additional thread next to the SDL sound thread
- does not crackle with glitch free audio
- when used with a newer pulse, which does glitch free audio, the total latency is the same as with the alsa driver
- proper WaitAudio() implementation, saving another mixlen worth of latency
- adds a WaitDone() implementation

This patch has been written in consultation with Lennart Poettering (the pulseaudio author) and has been reviewed by him for correct use of the pa API.
author Sam Lantinga <slouken@libsdl.org>
date Mon, 21 Sep 2009 09:27:08 +0000
parents a1b03ba2fcd0
children
rev   line source
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
1 /*
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
2 SDL - Simple DirectMedia Layer
4159
a1b03ba2fcd0 Updated copyright date
Sam Lantinga <slouken@libsdl.org>
parents: 4064
diff changeset
3 Copyright (C) 1997-2009 Sam Lantinga
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
4
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
5 This library is free software; you can redistribute it and/or
1312
c9b51268668f Updated copyright information and removed rcs id lines (problematic in branch merges)
Sam Lantinga <slouken@libsdl.org>
parents: 1148
diff changeset
6 modify it under the terms of the GNU Lesser General Public
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
7 License as published by the Free Software Foundation; either
1312
c9b51268668f Updated copyright information and removed rcs id lines (problematic in branch merges)
Sam Lantinga <slouken@libsdl.org>
parents: 1148
diff changeset
8 version 2.1 of the License, or (at your option) any later version.
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
9
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
10 This library is distributed in the hope that it will be useful,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
11 but WITHOUT ANY WARRANTY; without even the implied warranty of
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
12 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
1312
c9b51268668f Updated copyright information and removed rcs id lines (problematic in branch merges)
Sam Lantinga <slouken@libsdl.org>
parents: 1148
diff changeset
13 Lesser General Public License for more details.
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
14
1312
c9b51268668f Updated copyright information and removed rcs id lines (problematic in branch merges)
Sam Lantinga <slouken@libsdl.org>
parents: 1148
diff changeset
15 You should have received a copy of the GNU Lesser General Public
c9b51268668f Updated copyright information and removed rcs id lines (problematic in branch merges)
Sam Lantinga <slouken@libsdl.org>
parents: 1148
diff changeset
16 License along with this library; if not, write to the Free Software
c9b51268668f Updated copyright information and removed rcs id lines (problematic in branch merges)
Sam Lantinga <slouken@libsdl.org>
parents: 1148
diff changeset
17 Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
18
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
19 Sam Lantinga
252
e8157fcb3114 Updated the source with the correct e-mail address
Sam Lantinga <slouken@libsdl.org>
parents: 0
diff changeset
20 slouken@libsdl.org
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
21 */
1402
d910939febfa Use consistent identifiers for the various platforms we support.
Sam Lantinga <slouken@libsdl.org>
parents: 1361
diff changeset
22 #include "SDL_config.h"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
23
4064
940fddb81bea Mac OS X/x86 won't build the MMX/YUV inline assembly without optimizations
Ryan C. Gordon <icculus@icculus.org>
parents: 4062
diff changeset
24 #if (__GNUC__ > 2) && defined(__i386__) && __OPTIMIZE__ && SDL_ASSEMBLY_ROUTINES
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
25
1407
0c6941483cc6 Whoops, forgot to check in this fix
Sam Lantinga <slouken@libsdl.org>
parents: 1402
diff changeset
26 #include "SDL_stdinc.h"
0c6941483cc6 Whoops, forgot to check in this fix
Sam Lantinga <slouken@libsdl.org>
parents: 1402
diff changeset
27
4045
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
28 #include "mmx.h"
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
29
4047
810c6f4ab7aa Merged r3207:3208 from trunk/SDL: *INDENT-OFF* for inline asm.
Ryan C. Gordon <icculus@icculus.org>
parents: 4046
diff changeset
30 /* *INDENT-OFF* */
810c6f4ab7aa Merged r3207:3208 from trunk/SDL: *INDENT-OFF* for inline asm.
Ryan C. Gordon <icculus@icculus.org>
parents: 4046
diff changeset
31
4045
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
32 static mmx_t MMX_0080w = { .ud = {0x00800080, 0x00800080} };
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
33 static mmx_t MMX_00FFw = { .ud = {0x00ff00ff, 0x00ff00ff} };
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
34 static mmx_t MMX_FF00w = { .ud = {0xff00ff00, 0xff00ff00} };
1148
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
35
4045
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
36 static mmx_t MMX_Ycoeff = { .uw = {0x004a, 0x004a, 0x004a, 0x004a} };
1148
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
37
4045
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
38 static mmx_t MMX_UbluRGB = { .uw = {0x0072, 0x0072, 0x0072, 0x0072} };
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
39 static mmx_t MMX_VredRGB = { .uw = {0x0059, 0x0059, 0x0059, 0x0059} };
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
40 static mmx_t MMX_UgrnRGB = { .uw = {0xffea, 0xffea, 0xffea, 0xffea} };
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
41 static mmx_t MMX_VgrnRGB = { .uw = {0xffd2, 0xffd2, 0xffd2, 0xffd2} };
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
42
4045
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
43 static mmx_t MMX_Ublu5x5 = { .uw = {0x0081, 0x0081, 0x0081, 0x0081} };
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
44 static mmx_t MMX_Vred5x5 = { .uw = {0x0066, 0x0066, 0x0066, 0x0066} };
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
45 static mmx_t MMX_Ugrn565 = { .uw = {0xffe8, 0xffe8, 0xffe8, 0xffe8} };
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
46 static mmx_t MMX_Vgrn565 = { .uw = {0xffcd, 0xffcd, 0xffcd, 0xffcd} };
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
47
4045
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
48 static mmx_t MMX_red565 = { .uw = {0xf800, 0xf800, 0xf800, 0xf800} };
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
49 static mmx_t MMX_grn565 = { .uw = {0x07e0, 0x07e0, 0x07e0, 0x07e0} };
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
50
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
51 /**
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
52 This MMX assembler is my first assembler/MMX program ever.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
53 Thus it may be buggy.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
54 Send patches to:
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
55 mvogt@rhrk.uni-kl.de
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
56
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
57 After it worked fine I have "obfuscated" the code a bit to have
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
58 more parallelism in the MMX units. This means I moved
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
59 initialisation around and delayed other instructions.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
60 Performance measurement did not show that this brought any advantage
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
61 but in theory it _should_ be faster this way.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
62
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
63 The overall performance gain over the C-based dither was 30%-40%.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
64 The MMX routine calculates 256 bits = 8 RGB values in each loop iteration
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
65 (4 for row1 & 4 for row2)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
66
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
67 The red/green/blue coefficients are taken from the mpeg_play
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
68 player. They look nice, but I don't know if you can have
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
69 better values, to avoid integer rounding errors.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
70
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
71
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
72 IMPORTANT:
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
73 ==========
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
74
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
75 It is a requirement that cr/cb/lum are 8-byte aligned and
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
76 out is 16-byte aligned, or you may get segfaults
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
77
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
78 */
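As a reading aid for the assembly that follows, here is a minimal scalar sketch of the arithmetic the MMX loop appears to perform for one pixel. It is not part of SDL. The names `asr6`, `clamp_u8` and `yuv_to_rgb32_ref` are hypothetical, and the pairing of each coefficient with its asm operand is an assumption, because the operand constraint list is not visible in this listing. The coefficients are the `MMX_*RGB` constants above read as signed 16-bit words, scaled by 64.

```c
#include <stdint.h>

/* Fixed-point chroma coefficients, scaled by 64 (hence the shift by 6).
   Assumption: these match the operands used by the asm loop. */
enum {
    V_RED =  89,  /* MMX_VredRGB = 0x0059 */
    V_GRN = -46,  /* MMX_VgrnRGB = 0xffd2 as a signed 16-bit word */
    U_GRN = -22,  /* MMX_UgrnRGB = 0xffea as a signed 16-bit word */
    U_BLU = 114   /* MMX_UbluRGB = 0x0072 */
};

/* Floor division by 64, mirroring an arithmetic right shift by 6 (psraw $6). */
static int asr6(int v)
{
    return v >= 0 ? v / 64 : -((-v + 63) / 64);
}

/* Clamp to 0..255, standing in for the unsigned saturation of packuswb. */
static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical scalar reference: one Y/Cb/Cr sample to a 0x00RRGGBB pixel. */
static uint32_t yuv_to_rgb32_ref(uint8_t y, uint8_t cb, uint8_t cr)
{
    int u = (int)cb - 128;  /* centre chroma around zero (psubw of 0x0080) */
    int v = (int)cr - 128;
    int r = y + asr6(v * V_RED);
    int g = y + asr6(u * U_GRN) + asr6(v * V_GRN);
    int b = y + asr6(u * U_BLU);
    return ((uint32_t)clamp_u8(r) << 16)
         | ((uint32_t)clamp_u8(g) << 8)
         |  (uint32_t)clamp_u8(b);
}
```

A sketch like this could serve as a reference when comparing the MMX output against a plain C path. Each chroma sample is shared by a 2x2 block of luma samples, so the caller would pass the same `cb`/`cr` for four neighbouring pixels.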
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
79
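The alignment requirement stated in the comment above can be checked before calling the routine. The following is a small sketch under that stated requirement only (8 bytes for `lum`/`cr`/`cb`, 16 bytes for `out`). The helper names `is_aligned` and `yv12_buffers_aligned` are hypothetical and do not exist in SDL.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if p is a multiple of alignment (alignment must be > 0). */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Hypothetical pre-flight check for the buffers passed to the MMX routine. */
static int yv12_buffers_aligned(const unsigned char *lum,
                                const unsigned char *cr,
                                const unsigned char *cb,
                                const unsigned char *out)
{
    return is_aligned(lum, 8) && is_aligned(cr, 8) &&
           is_aligned(cb, 8) && is_aligned(out, 16);
}
```

A caller might fall back to a C conversion path when `yv12_buffers_aligned` returns zero rather than risk the faults the comment warns about.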
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
80 void ColorRGBDitherYV12MMX1X( int *colortab, Uint32 *rgb_2_pix,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
81 unsigned char *lum, unsigned char *cr,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
82 unsigned char *cb, unsigned char *out,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
83 int rows, int cols, int mod )
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
84 {
4045
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
85 Uint32 *row1;
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
86 Uint32 *row2;
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
87
4045
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
88 unsigned char* y = lum +cols*rows; // Pointer to the end
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
89 int x = 0;
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
90 row1 = (Uint32 *)out; // 32 bit target
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
91 row2 = (Uint32 *)out+cols+mod; // start of second row
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
92 mod = (mod+cols+mod)*4; // increment for row1 in bytes
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
93
4045
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
94 __asm__ __volatile__ (
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
95 // tap dance to work around the inability to use %%ebx at will...
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
96 // move one thing to the stack...
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
97 "pushl $0\n" // save a slot on the stack.
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
98 "pushl %%ebx\n" // save %%ebx.
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
99 "movl %0, %%ebx\n" // put the thing in ebx.
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
100 "movl %%ebx,4(%%esp)\n" // put the thing in the stack slot.
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
101 "popl %%ebx\n" // get back %%ebx (the PIC register).
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
102
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
103 ".align 8\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
104 "1:\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
105
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
106 // create Cr (result in mm1)
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
107 "pushl %%ebx\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
108 "movl 4(%%esp),%%ebx\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
109 "movd (%%ebx),%%mm1\n" // 0 0 0 0 v3 v2 v1 v0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
110 "popl %%ebx\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
111 "pxor %%mm7,%%mm7\n" // 00 00 00 00 00 00 00 00
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
112 "movd (%2), %%mm2\n" // 0 0 0 0 l3 l2 l1 l0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
113 "punpcklbw %%mm7,%%mm1\n" // 0 v3 0 v2 00 v1 00 v0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
114 "punpckldq %%mm1,%%mm1\n" // 00 v1 00 v0 00 v1 00 v0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
115 "psubw %9,%%mm1\n" // mm1-128:r1 r1 r0 r0 r1 r1 r0 r0
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
116
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
117 // create Cr_g (result in mm0)
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
118 "movq %%mm1,%%mm0\n" // r1 r1 r0 r0 r1 r1 r0 r0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
119 "pmullw %10,%%mm0\n" // red*-46dec=0.7136*64
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
120 "pmullw %11,%%mm1\n" // red*89dec=1.4013*64
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
121 "psraw $6, %%mm0\n" // red=red/64
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
122 "psraw $6, %%mm1\n" // red=red/64
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
123
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
124 // create L1 L2 (result in mm2,mm4)
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
125 // L2=lum+cols
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
126 "movq (%2,%4),%%mm3\n" // 0 0 0 0 L3 L2 L1 L0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
127 "punpckldq %%mm3,%%mm2\n" // L3 L2 L1 L0 l3 l2 l1 l0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
128 "movq %%mm2,%%mm4\n" // L3 L2 L1 L0 l3 l2 l1 l0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
129 "pand %12,%%mm2\n" // L3 0 L1 0 l3 0 l1 0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
130 "pand %13,%%mm4\n" // 0 L2 0 L0 0 l2 0 l0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
131 "psrlw $8,%%mm2\n" // 0 L3 0 L1 0 l3 0 l1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
132
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
133 // create R (result in mm6)
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
134 "movq %%mm2,%%mm5\n" // 0 L3 0 L1 0 l3 0 l1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
135 "movq %%mm4,%%mm6\n" // 0 L2 0 L0 0 l2 0 l0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
136 "paddsw %%mm1, %%mm5\n" // lum1+red:x R3 x R1 x r3 x r1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
137 "paddsw %%mm1, %%mm6\n" // lum1+red:x R2 x R0 x r2 x r0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
138 "packuswb %%mm5,%%mm5\n" // R3 R1 r3 r1 R3 R1 r3 r1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
139 "packuswb %%mm6,%%mm6\n" // R2 R0 r2 r0 R2 R0 r2 r0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
140 "pxor %%mm7,%%mm7\n" // 00 00 00 00 00 00 00 00
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
141 "punpcklbw %%mm5,%%mm6\n" // R3 R2 R1 R0 r3 r2 r1 r0
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
142
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
143 // create Cb (result in mm1)
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
144 "movd (%1), %%mm1\n" // 0 0 0 0 u3 u2 u1 u0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
145 "punpcklbw %%mm7,%%mm1\n" // 0 u3 0 u2 00 u1 00 u0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
146 "punpckldq %%mm1,%%mm1\n" // 00 u1 00 u0 00 u1 00 u0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
147 "psubw %9,%%mm1\n" // mm1-128:u1 u1 u0 u0 u1 u1 u0 u0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
148 // create Cb_g (result in mm5)
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
149 "movq %%mm1,%%mm5\n" // u1 u1 u0 u0 u1 u1 u0 u0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
150 "pmullw %14,%%mm5\n" // blue*-109dec=1.7129*64
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
151 "pmullw %15,%%mm1\n" // blue*114dec=1.78125*64
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
152 "psraw $6, %%mm5\n" // blue=blue/64
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
153 "psraw $6, %%mm1\n" // blue=blue/64
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
154
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
155 // create G (result in mm7)
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
156 "movq %%mm2,%%mm3\n" // 0 L3 0 L1 0 l3 0 l1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
157 "movq %%mm4,%%mm7\n" // 0 L2 0 L0 0 l2 0 l0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
158 "paddsw %%mm5, %%mm3\n" // lum1+Cb_g:x G3t x G1t x g3t x g1t
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
159 "paddsw %%mm5, %%mm7\n" // lum1+Cb_g:x G2t x G0t x g2t x g0t
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
160 "paddsw %%mm0, %%mm3\n" // lum1+Cr_g:x G3 x G1 x g3 x g1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
161 "paddsw %%mm0, %%mm7\n" // lum1+Cr_g:x G2 x G0 x g2 x g0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
162 "packuswb %%mm3,%%mm3\n" // G3 G1 g3 g1 G3 G1 g3 g1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
163 "packuswb %%mm7,%%mm7\n" // G2 G0 g2 g0 G2 G0 g2 g0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
164 "punpcklbw %%mm3,%%mm7\n" // G3 G2 G1 G0 g3 g2 g1 g0
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
165
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
166 // create B (result in mm5)
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
167 "movq %%mm2,%%mm3\n" // 0 L3 0 L1 0 l3 0 l1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
168 "movq %%mm4,%%mm5\n" // 0 L2 0 L0 0 l2 0 l0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
169 "paddsw %%mm1, %%mm3\n" // lum1+blue:x B3 x B1 x b3 x b1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
170 "paddsw %%mm1, %%mm5\n" // lum1+blue:x B2 x B0 x b2 x b0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
171 "packuswb %%mm3,%%mm3\n" // B3 B1 b3 b1 B3 B1 b3 b1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
172 "packuswb %%mm5,%%mm5\n" // B2 B0 b2 b0 B2 B0 b2 b0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
173 "punpcklbw %%mm3,%%mm5\n" // B3 B2 B1 B0 b3 b2 b1 b0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
174
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
175 // fill destination row1 (needed are mm6=Rr,mm7=Gg,mm5=Bb)
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
176
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
177 "pxor %%mm2,%%mm2\n" // 0 0 0 0 0 0 0 0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
178 "pxor %%mm4,%%mm4\n" // 0 0 0 0 0 0 0 0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
179 "movq %%mm6,%%mm1\n" // R3 R2 R1 R0 r3 r2 r1 r0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
180 "movq %%mm5,%%mm3\n" // B3 B2 B1 B0 b3 b2 b1 b0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
181 // process lower lum
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
182 "punpcklbw %%mm4,%%mm1\n" // 0 r3 0 r2 0 r1 0 r0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
183 "punpcklbw %%mm4,%%mm3\n" // 0 b3 0 b2 0 b1 0 b0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
184 "movq %%mm1,%%mm2\n" // 0 r3 0 r2 0 r1 0 r0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
185 "movq %%mm3,%%mm0\n" // 0 b3 0 b2 0 b1 0 b0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
186 "punpcklwd %%mm1,%%mm3\n" // 0 r1 0 b1 0 r0 0 b0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
187 "punpckhwd %%mm2,%%mm0\n" // 0 r3 0 b3 0 r2 0 b2
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
188
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
189 "pxor %%mm2,%%mm2\n" // 0 0 0 0 0 0 0 0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
190 "movq %%mm7,%%mm1\n" // G3 G2 G1 G0 g3 g2 g1 g0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
191 "punpcklbw %%mm1,%%mm2\n" // g3 0 g2 0 g1 0 g0 0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
192 "punpcklwd %%mm4,%%mm2\n" // 0 0 g1 0 0 0 g0 0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
193 "por %%mm3, %%mm2\n" // 0 r1 g1 b1 0 r0 g0 b0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
194 "movq %%mm2,(%3)\n" // wrote out ! row1
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
195
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
196 "pxor %%mm2,%%mm2\n" // 0 0 0 0 0 0 0 0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
197 "punpcklbw %%mm1,%%mm4\n" // g3 0 g2 0 g1 0 g0 0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
198 "punpckhwd %%mm2,%%mm4\n" // 0 0 g3 0 0 0 g2 0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
199 "por %%mm0, %%mm4\n" // 0 r3 g3 b3 0 r2 g2 b2
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
200 "movq %%mm4,8(%3)\n" // wrote out ! row1
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
201
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
202 // fill destination row2 (needed are mm6=Rr,mm7=Gg,mm5=Bb)
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
203 // this can be done "destructive"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
204 "pxor %%mm2,%%mm2\n" // 0 0 0 0 0 0 0 0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
205 "punpckhbw %%mm2,%%mm6\n" // 0 R3 0 R2 0 R1 0 R0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
206 "punpckhbw %%mm1,%%mm5\n" // G3 B3 G2 B2 G1 B1 G0 B0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
207 "movq %%mm5,%%mm1\n" // G3 B3 G2 B2 G1 B1 G0 B0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
208 "punpcklwd %%mm6,%%mm1\n" // 0 R1 G1 B1 0 R0 G0 B0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
209 "movq %%mm1,(%5)\n" // wrote out ! row2
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
210 "punpckhwd %%mm6,%%mm5\n" // 0 R3 G3 B3 0 R2 G2 B2
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
211 "movq %%mm5,8(%5)\n" // wrote out ! row2
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
212
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
213 "addl $4,%2\n" // lum+4
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
214 "leal 16(%3),%3\n" // row1+16
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
215 "leal 16(%5),%5\n" // row2+16
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
216 "addl $2,(%%esp)\n" // cr+2
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
217 "addl $2,%1\n" // cb+2
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
218
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
219 "addl $4,%6\n" // x+4
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
220 "cmpl %4,%6\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
221
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
222 "jl 1b\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
223 "addl %4,%2\n" // lum += cols
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
224 "addl %8,%3\n" // row1+= mod
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
225 "addl %8,%5\n" // row2+= mod
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
226 "movl $0,%6\n" // x=0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
227 "cmpl %7,%2\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
228 "jl 1b\n"
4045
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
229
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
230 "addl $4,%%esp\n" // get rid of the stack slot we reserved.
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
231 "emms\n" // reset MMX registers.
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
232 :
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
233 : "m" (cr), "r"(cb),"r"(lum),
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
234 "r"(row1),"r"(cols),"r"(row2),"m"(x),"m"(y),"m"(mod),
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
235 "m"(MMX_0080w),"m"(MMX_VgrnRGB),"m"(MMX_VredRGB),
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
236 "m"(MMX_FF00w),"m"(MMX_00FFw),"m"(MMX_UgrnRGB),
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
237 "m"(MMX_UbluRGB)
4045
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
238 );
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
239 }
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
240
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
241 void Color565DitherYV12MMX1X( int *colortab, Uint32 *rgb_2_pix,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
242 unsigned char *lum, unsigned char *cr,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
243 unsigned char *cb, unsigned char *out,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
244 int rows, int cols, int mod )
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
245 {
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
246 Uint16 *row1;
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
247 Uint16 *row2;
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
248
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
249 unsigned char* y = lum +cols*rows; /* Pointer to the end */
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
250 int x = 0;
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
251 row1 = (Uint16 *)out; /* 16 bit target */
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
252 row2 = (Uint16 *)out+cols+mod; /* start of second row */
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
253 mod = (mod+cols+mod)*2; /* increment for row1 in bytes */
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
254
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
255 __asm__ __volatile__(
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
256 // tap dance to work around the inability to use %%ebx at will...
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
257 // move one thing to the stack...
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
258 "pushl $0\n" // save a slot on the stack.
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
259 "pushl %%ebx\n" // save %%ebx.
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
260 "movl %0, %%ebx\n" // put the thing in ebx.
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
261 "movl %%ebx, 4(%%esp)\n" // put the thing in the stack slot.
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
262 "popl %%ebx\n" // get back %%ebx (the PIC register).
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
263
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
264 ".align 8\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
265 "1:\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
266 "movd (%1), %%mm0\n" // 4 Cb 0 0 0 0 u3 u2 u1 u0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
267 "pxor %%mm7, %%mm7\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
268 "pushl %%ebx\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
269 "movl 4(%%esp), %%ebx\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
270 "movd (%%ebx), %%mm1\n" // 4 Cr 0 0 0 0 v3 v2 v1 v0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
271 "popl %%ebx\n"
4045
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
272
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
273 "punpcklbw %%mm7, %%mm0\n" // 4 W cb 0 u3 0 u2 0 u1 0 u0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
274 "punpcklbw %%mm7, %%mm1\n" // 4 W cr 0 v3 0 v2 0 v1 0 v0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
275 "psubw %9, %%mm0\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
276 "psubw %9, %%mm1\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
277 "movq %%mm0, %%mm2\n" // Cb 0 u3 0 u2 0 u1 0 u0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
278 "movq %%mm1, %%mm3\n" // Cr
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
279 "pmullw %10, %%mm2\n" // Cb2green 0 R3 0 R2 0 R1 0 R0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
280 "movq (%2), %%mm6\n" // L1 L7 L6 L5 L4 L3 L2 L1 L0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
281 "pmullw %11, %%mm0\n" // Cb2blue
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
282 "pand %12, %%mm6\n" // L1 00 L6 00 L4 00 L2 00 L0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
283 "pmullw %13, %%mm3\n" // Cr2green
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
284 "movq (%2), %%mm7\n" // L2
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
285 "pmullw %14, %%mm1\n" // Cr2red
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
286 "psrlw $8, %%mm7\n" // L2 00 L7 00 L5 00 L3 00 L1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
287 "pmullw %15, %%mm6\n" // lum1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
288 "paddw %%mm3, %%mm2\n" // Cb2green + Cr2green == green
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
289 "pmullw %15, %%mm7\n" // lum2
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
290
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
291 "movq %%mm6, %%mm4\n" // lum1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
292 "paddw %%mm0, %%mm6\n" // lum1 +blue 00 B6 00 B4 00 B2 00 B0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
293 "movq %%mm4, %%mm5\n" // lum1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
294 "paddw %%mm1, %%mm4\n" // lum1 +red 00 R6 00 R4 00 R2 00 R0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
295 "paddw %%mm2, %%mm5\n" // lum1 +green 00 G6 00 G4 00 G2 00 G0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
296 "psraw $6, %%mm4\n" // R1 0 .. 64
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
297 "movq %%mm7, %%mm3\n" // lum2 00 L7 00 L5 00 L3 00 L1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
298 "psraw $6, %%mm5\n" // G1 - .. +
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
299 "paddw %%mm0, %%mm7\n" // Lum2 +blue 00 B7 00 B5 00 B3 00 B1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
300 "psraw $6, %%mm6\n" // B1 0 .. 64
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
301 "packuswb %%mm4, %%mm4\n" // R1 R1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
302 "packuswb %%mm5, %%mm5\n" // G1 G1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
303 "packuswb %%mm6, %%mm6\n" // B1 B1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
304 "punpcklbw %%mm4, %%mm4\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
305 "punpcklbw %%mm5, %%mm5\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
306
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
307 "pand %16, %%mm4\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
308 "psllw $3, %%mm5\n" // GREEN 1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
309 "punpcklbw %%mm6, %%mm6\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
310 "pand %17, %%mm5\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
311 "pand %16, %%mm6\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
312 "por %%mm5, %%mm4\n" //
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
313 "psrlw $11, %%mm6\n" // BLUE 1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
314 "movq %%mm3, %%mm5\n" // lum2
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
315 "paddw %%mm1, %%mm3\n" // lum2 +red 00 R7 00 R5 00 R3 00 R1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
316 "paddw %%mm2, %%mm5\n" // lum2 +green 00 G7 00 G5 00 G3 00 G1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
317 "psraw $6, %%mm3\n" // R2
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
318 "por %%mm6, %%mm4\n" // MM4
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
319 "psraw $6, %%mm5\n" // G2
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
320 "movq (%2, %4), %%mm6\n" // L3 load lum2
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
321 "psraw $6, %%mm7\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
322 "packuswb %%mm3, %%mm3\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
323 "packuswb %%mm5, %%mm5\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
324 "packuswb %%mm7, %%mm7\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
325 "pand %12, %%mm6\n" // L3
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
326 "punpcklbw %%mm3, %%mm3\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
327 "punpcklbw %%mm5, %%mm5\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
328 "pmullw %15, %%mm6\n" // lum3
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
329 "punpcklbw %%mm7, %%mm7\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
330 "psllw $3, %%mm5\n" // GREEN 2
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
331 "pand %16, %%mm7\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
332 "pand %16, %%mm3\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
333 "psrlw $11, %%mm7\n" // BLUE 2
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
334 "pand %17, %%mm5\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
335 "por %%mm7, %%mm3\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
336 "movq (%2,%4), %%mm7\n" // L4 load lum2
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
337 "por %%mm5, %%mm3\n" //
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
338 "psrlw $8, %%mm7\n" // L4
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
339 "movq %%mm4, %%mm5\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
340 "punpcklwd %%mm3, %%mm4\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
341 "pmullw %15, %%mm7\n" // lum4
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
342 "punpckhwd %%mm3, %%mm5\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
343
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
344 "movq %%mm4, (%3)\n" // write row1
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
345 "movq %%mm5, 8(%3)\n" // write row1
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
346
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
347 "movq %%mm6, %%mm4\n" // Lum3
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
348 "paddw %%mm0, %%mm6\n" // Lum3 +blue
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
349
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
350 "movq %%mm4, %%mm5\n" // Lum3
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
351 "paddw %%mm1, %%mm4\n" // Lum3 +red
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
352 "paddw %%mm2, %%mm5\n" // Lum3 +green
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
353 "psraw $6, %%mm4\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
354 "movq %%mm7, %%mm3\n" // Lum4
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
355 "psraw $6, %%mm5\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
356 "paddw %%mm0, %%mm7\n" // Lum4 +blue
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
357 "psraw $6, %%mm6\n" // Lum3 +blue
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
358 "movq %%mm3, %%mm0\n" // Lum4
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
359 "packuswb %%mm4, %%mm4\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
360 "paddw %%mm1, %%mm3\n" // Lum4 +red
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
361 "packuswb %%mm5, %%mm5\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
362 "paddw %%mm2, %%mm0\n" // Lum4 +green
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
363 "packuswb %%mm6, %%mm6\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
364 "punpcklbw %%mm4, %%mm4\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
365 "punpcklbw %%mm5, %%mm5\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
366 "punpcklbw %%mm6, %%mm6\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
367 "psllw $3, %%mm5\n" // GREEN 3
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
368 "pand %16, %%mm4\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
369 "psraw $6, %%mm3\n" // psr 6
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
370 "psraw $6, %%mm0\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
371 "pand %16, %%mm6\n" // BLUE
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
372 "pand %17, %%mm5\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
373 "psrlw $11, %%mm6\n" // BLUE 3
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
374 "por %%mm5, %%mm4\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
375 "psraw $6, %%mm7\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
376 "por %%mm6, %%mm4\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
377 "packuswb %%mm3, %%mm3\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
378 "packuswb %%mm0, %%mm0\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
379 "packuswb %%mm7, %%mm7\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
380 "punpcklbw %%mm3, %%mm3\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
381 "punpcklbw %%mm0, %%mm0\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
382 "punpcklbw %%mm7, %%mm7\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
383 "pand %16, %%mm3\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
384 "pand %16, %%mm7\n" // BLUE
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
385 "psllw $3, %%mm0\n" // GREEN 4
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
386 "psrlw $11, %%mm7\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
387 "pand %17, %%mm0\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
388 "por %%mm7, %%mm3\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
389 "por %%mm0, %%mm3\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
390
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
391 "movq %%mm4, %%mm5\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
392
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
393 "punpcklwd %%mm3, %%mm4\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
394 "punpckhwd %%mm3, %%mm5\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
395
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
396 "movq %%mm4, (%5)\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
397 "movq %%mm5, 8(%5)\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
398
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
399 "addl $8, %6\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
400 "addl $8, %2\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
401 "addl $4, (%%esp)\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
402 "addl $4, %1\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
403 "cmpl %4, %6\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
404 "leal 16(%3), %3\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
405 "leal 16(%5),%5\n" // row2+16
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
406
4046
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
407 "jl 1b\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
408 "addl %4, %2\n" // lum += cols
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
409 "addl %8, %3\n" // row1+= mod
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
410 "addl %8, %5\n" // row2+= mod
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
411 "movl $0, %6\n" // x=0
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
412 "cmpl %7, %2\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
413 "jl 1b\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
414 "addl $4, %%esp\n" // get rid of the stack slot we reserved.
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
415 "emms\n"
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
416 :
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
417 : "m" (cr), "r"(cb),"r"(lum),
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
418 "r"(row1),"r"(cols),"r"(row2),"m"(x),"m"(y),"m"(mod),
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
419 "m"(MMX_0080w),"m"(MMX_Ugrn565),"m"(MMX_Ublu5x5),
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
420 "m"(MMX_00FFw),"m"(MMX_Vgrn565),"m"(MMX_Vred5x5),
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
421 "m"(MMX_Ycoeff),"m"(MMX_red565),"m"(MMX_grn565)
3a9e60224efe Cleaned up tabs.
Ryan C. Gordon <icculus@icculus.org>
parents: 4045
diff changeset
422 );
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
423 }
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
424
4047
810c6f4ab7aa Merged r3207:3208 from trunk/SDL: *INDENT-OFF* for inline asm.
Ryan C. Gordon <icculus@icculus.org>
parents: 4046
diff changeset
425 /* *INDENT-ON* */
810c6f4ab7aa Merged r3207:3208 from trunk/SDL: *INDENT-OFF* for inline asm.
Ryan C. Gordon <icculus@icculus.org>
parents: 4046
diff changeset
426
4045
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
427 #endif /* GCC3 i386 inline assembly */
f420bba13676 GCC inline asm for MMX YUV processing no longer has textrels and now works when
Ryan C. Gordon <icculus@icculus.org>
parents: 1413
diff changeset
428
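The RGB565 packing that Color565DitherYV12MMX1X performs on each 16-bit lane (punpcklbw of a register with itself, then pand / psllw $3 / psrlw $11 / por) can be sketched in scalar C. This is only an illustration, not code from SDL: `pack565` is a hypothetical helper, and the mask values 0xF800 and 0x07E0 are assumed here for MMX_red565 and MMX_grn565, whose definitions are not part of this listing.

```c
#include <stdint.h>

/* Assumed per-lane mask values; the real MMX_red565 and MMX_grn565
   constants are defined elsewhere in the file and are not shown here. */
#define RED565_MASK 0xF800u
#define GRN565_MASK 0x07E0u

/* Hypothetical helper: pack one 8-bit R, G, B triple into a 16-bit
   RGB565 pixel, mirroring the per-lane steps of the MMX code. */
static uint16_t pack565(uint8_t r, uint8_t g, uint8_t b)
{
    /* punpcklbw reg,reg: duplicate the byte into both halves of a word */
    uint16_t rr = (uint16_t)((r << 8) | r);
    uint16_t gg = (uint16_t)((g << 8) | g);
    uint16_t bb = (uint16_t)((b << 8) | b);

    /* red: pand with the red mask keeps the top 5 bits in bits 11..15 */
    uint16_t red = (uint16_t)(rr & RED565_MASK);

    /* green: psllw $3, then pand with the green mask (bits 5..10) */
    uint16_t green = (uint16_t)((uint16_t)(gg << 3) & GRN565_MASK);

    /* blue: pand with the red mask, then psrlw $11 moves it to bits 0..4 */
    uint16_t blue = (uint16_t)((bb & RED565_MASK) >> 11);

    /* por: merge the three fields into one pixel */
    return (uint16_t)(red | green | blue);
}
```

Under those assumed masks, the result equals `((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)`. The MMX code handles four such lanes per register, and punpcklwd / punpckhwd then interleave the even-pixel and odd-pixel results before the two movq stores to each output row.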