annotate src/video/SDL_yuv_mmx.c @ 1555:780fd5b61df1

Fixed bug #89

Date: Sun, 23 Oct 2005 16:39:03 +0200
From: "A. Schmid" <sahib@phreaker.net>
Subject: [SDL] no software surfaces with svgalib driver?

Hi,

I noticed that the SDL (1.2.9) svgalib driver only makes use of linear addressable (framebuffer) video modes. On older systems (like one of mine), linear addressable modes are often not available. Especially for cards with VESA VBE < 2.0 the svgalib vesa driver is unusable, since VESA only supports framebuffering for VBE 2.0 and later.

The changes necessary to add support for software surfaces seem to be relatively small. I only had to hack src/video/svga/SDL_svgavideo.c (see attached patch). The code worked fine for me, but it is no more than a proof of concept and should be reviewed (probably has a memory leak when switching modes). It also uses the vgagl library (included in the svgalib package) and needs to be linked against it.

-Alex
author Sam Lantinga <slouken@libsdl.org>
date Sun, 19 Mar 2006 12:05:16 +0000
parents 40edc79b0926
children 782fd950bd46 c121d94672cb f420bba13676
rev   line source
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
1 /*
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
2 SDL - Simple DirectMedia Layer
1312
c9b51268668f Updated copyright information and removed rcs id lines (problematic in branch merges)
Sam Lantinga <slouken@libsdl.org>
parents: 1148
diff changeset
3 Copyright (C) 1997-2006 Sam Lantinga
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
4
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
5 This library is free software; you can redistribute it and/or
1312
c9b51268668f Updated copyright information and removed rcs id lines (problematic in branch merges)
Sam Lantinga <slouken@libsdl.org>
parents: 1148
diff changeset
6 modify it under the terms of the GNU Lesser General Public
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
7 License as published by the Free Software Foundation; either
1312
c9b51268668f Updated copyright information and removed rcs id lines (problematic in branch merges)
Sam Lantinga <slouken@libsdl.org>
parents: 1148
diff changeset
8 version 2.1 of the License, or (at your option) any later version.
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
9
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
10 This library is distributed in the hope that it will be useful,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
11 but WITHOUT ANY WARRANTY; without even the implied warranty of
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
12 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
1312
c9b51268668f Updated copyright information and removed rcs id lines (problematic in branch merges)
Sam Lantinga <slouken@libsdl.org>
parents: 1148
diff changeset
13 Lesser General Public License for more details.
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
14
1312
c9b51268668f Updated copyright information and removed rcs id lines (problematic in branch merges)
Sam Lantinga <slouken@libsdl.org>
parents: 1148
diff changeset
15 You should have received a copy of the GNU Lesser General Public
c9b51268668f Updated copyright information and removed rcs id lines (problematic in branch merges)
Sam Lantinga <slouken@libsdl.org>
parents: 1148
diff changeset
16 License along with this library; if not, write to the Free Software
c9b51268668f Updated copyright information and removed rcs id lines (problematic in branch merges)
Sam Lantinga <slouken@libsdl.org>
parents: 1148
diff changeset
17 Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
18
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
19 Sam Lantinga
252
e8157fcb3114 Updated the source with the correct e-mail address
Sam Lantinga <slouken@libsdl.org>
parents: 0
diff changeset
20 slouken@libsdl.org
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
21 */
1402
d910939febfa Use consistent identifiers for the various platforms we support.
Sam Lantinga <slouken@libsdl.org>
parents: 1361
diff changeset
22 #include "SDL_config.h"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
23
1413
Sam Lantinga <slouken@libsdl.org>
parents: 1407
diff changeset
24 #if 0 /* FIXME: This code needs to be rewritten to reference the static data using relocatable addresses (e.g. http://www.gentoo.org/proj/en/hardened/pic-fix-guide.xml or http://nasm.sourceforge.net/doc/html/nasmdoc8.html#section-8.2) This code currently breaks on systems with readonly text segments (hardened Linux / Intel Mac) */
1402
d910939febfa Use consistent identifiers for the various platforms we support.
Sam Lantinga <slouken@libsdl.org>
parents: 1361
diff changeset
25 #if defined(__GNUC__) && defined(__i386__) && SDL_ASSEMBLY_ROUTINES
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
26
1407
0c6941483cc6 Whoops, forgot to check in this fix
Sam Lantinga <slouken@libsdl.org>
parents: 1402
diff changeset
27 #include "SDL_stdinc.h"
0c6941483cc6 Whoops, forgot to check in this fix
Sam Lantinga <slouken@libsdl.org>
parents: 1402
diff changeset
28
1148
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
29 #define ASM_ARRAY(x) x[] __asm__("_" #x) __attribute__((used))
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
30
1148
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
31 static unsigned int ASM_ARRAY(MMX_0080w) = {0x00800080, 0x00800080};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
32 static unsigned int ASM_ARRAY(MMX_00FFw) = {0x00ff00ff, 0x00ff00ff};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
33 static unsigned int ASM_ARRAY(MMX_FF00w) = {0xff00ff00, 0xff00ff00};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
34
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
35 static unsigned short ASM_ARRAY(MMX_Ycoeff) = {0x004a, 0x004a, 0x004a, 0x004a};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
36
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
37 static unsigned short ASM_ARRAY(MMX_UbluRGB) = {0x0072, 0x0072, 0x0072, 0x0072};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
38 static unsigned short ASM_ARRAY(MMX_VredRGB) = {0x0059, 0x0059, 0x0059, 0x0059};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
39 static unsigned short ASM_ARRAY(MMX_UgrnRGB) = {0xffea, 0xffea, 0xffea, 0xffea};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
40 static unsigned short ASM_ARRAY(MMX_VgrnRGB) = {0xffd2, 0xffd2, 0xffd2, 0xffd2};
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
41
1148
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
42 static unsigned short ASM_ARRAY(MMX_Ublu5x5) = {0x0081, 0x0081, 0x0081, 0x0081};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
43 static unsigned short ASM_ARRAY(MMX_Vred5x5) = {0x0066, 0x0066, 0x0066, 0x0066};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
44 static unsigned short ASM_ARRAY(MMX_Ugrn555) = {0xffe7, 0xffe7, 0xffe7, 0xffe7};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
45 static unsigned short ASM_ARRAY(MMX_Vgrn555) = {0xffcc, 0xffcc, 0xffcc, 0xffcc};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
46 static unsigned short ASM_ARRAY(MMX_Ugrn565) = {0xffe8, 0xffe8, 0xffe8, 0xffe8};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
47 static unsigned short ASM_ARRAY(MMX_Vgrn565) = {0xffcd, 0xffcd, 0xffcd, 0xffcd};
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
48
1148
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
49 static unsigned short ASM_ARRAY(MMX_red555) = {0x7c00, 0x7c00, 0x7c00, 0x7c00};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
50 static unsigned short ASM_ARRAY(MMX_red565) = {0xf800, 0xf800, 0xf800, 0xf800};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
51 static unsigned short ASM_ARRAY(MMX_grn555) = {0x03e0, 0x03e0, 0x03e0, 0x03e0};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
52 static unsigned short ASM_ARRAY(MMX_grn565) = {0x07e0, 0x07e0, 0x07e0, 0x07e0};
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
53 static unsigned short ASM_ARRAY(MMX_blu5x5) = {0x001f, 0x001f, 0x001f, 0x001f};
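The `MMX_red555`, `MMX_grn555`, `MMX_red565`, `MMX_grn565` and `MMX_blu5x5` tables above hold the usual channel masks for 15-bit and 16-bit pixels, replicated four times so that one MMX register can mask four pixels at once. The routines that use them are not part of this listing, so the following is only a minimal scalar sketch of what such masks are normally for. The helper names `pack_rgb565` and `pack_rgb555` and the shift amounts are assumptions for illustration, not code from SDL.

```c
#include <stdint.h>

/* Mask values copied from the tables above (one lane of each). */
enum {
    RED565 = 0xf800, GRN565 = 0x07e0,
    RED555 = 0x7c00, GRN555 = 0x03e0,
    BLU5X5 = 0x001f   /* blue occupies the low 5 bits in both layouts */
};

/* Hypothetical helper: pack 8-bit R, G, B into one RGB565 pixel. */
static uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r << 8) & RED565) |
                      ((g << 3) & GRN565) |
                      ((b >> 3) & BLU5X5));
}

/* Hypothetical helper: pack 8-bit R, G, B into one RGB555 pixel. */
static uint16_t pack_rgb555(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r << 7) & RED555) |
                      ((g << 2) & GRN555) |
                      ((b >> 3) & BLU5X5));
}
```

Each channel is shifted so that its most significant bits line up with the mask, and the mask then discards the low bits that do not fit.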
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
54
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
55 /**
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
56 This MMX assembler is my first assembler/MMX program ever.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
57 Thus it may be buggy.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
58 Send patches to:
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
59 mvogt@rhrk.uni-kl.de
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
60
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
61 After it worked fine I have "obfuscated" the code a bit to have
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
62 more parallelism in the MMX units. This means I moved
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
63 initialisation around and delayed other instructions.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
64 Performance measurement did not show that this brought any advantage
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
65 but in theory it _should_ be faster this way.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
66
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
67 The overall performance gain over the C-based dither was 30%-40%.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
68 The MMX routine calculates 256bit=8RGB values in each cycle
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
69 (4 for row1 & 4 for row2)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
70
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
71 The red/green/blue coefficients are taken from the mpeg_play
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
72 player. They look nice, but I don't know if you can have
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
73 better values, to avoid integer rounding errors.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
74
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
75
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
76 IMPORTANT:
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
77 ==========
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
78
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
79 It is a requirement that the cr/cb/lum are 8 byte aligned and
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
80 the out is 16-byte aligned, or you may get segfaults
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
81
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
82 */
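As a reading aid for the comment above, here is a minimal scalar sketch of the per-pixel arithmetic that the 32-bit MMX routine below appears to perform. The multipliers 89, -46, -22 and 114 are the signed 16-bit values of `MMX_VredRGB`, `MMX_VgrnRGB`, `MMX_UgrnRGB` and `MMX_UbluRGB` (0x0059, 0xffd2, 0xffea, 0x0072), and the shift by 6 corresponds to `psraw $6`. The function names `asr6`, `clamp_u8` and `ycbcr_to_rgb` are hypothetical, and the sketch ignores the fact that the assembly shares one chroma pair across a 2x2 block of pixels.

```c
#include <stdint.h>

/* Floor division by 64, i.e. an arithmetic right shift by 6 as done by
   psraw $6, written without relying on >> of negative values. */
static int asr6(int v)
{
    return v >= 0 ? v >> 6 : -((-v + 63) >> 6);
}

/* Clamp to 0..255, as packuswb does when narrowing words to bytes. */
static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical scalar reference for one pixel (not SDL code). */
static void ycbcr_to_rgb(uint8_t lum, uint8_t cb, uint8_t cr,
                         uint8_t *r, uint8_t *g, uint8_t *b)
{
    int u = cb - 128;   /* centred Cb, the "blue" difference */
    int v = cr - 128;   /* centred Cr, the "red" difference  */

    *r = clamp_u8(lum + asr6(v * 89));
    *g = clamp_u8(lum + asr6(u * -22) + asr6(v * -46));
    *b = clamp_u8(lum + asr6(u * 114));
}
```

With 6 fractional bits the multipliers stand for 1.390625, -0.71875, -0.34375 and 1.78125, which is where the integer rounding error mentioned in the comment comes from.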
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
83
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
84 void ColorRGBDitherYV12MMX1X( int *colortab, Uint32 *rgb_2_pix,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
85 unsigned char *lum, unsigned char *cr,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
86 unsigned char *cb, unsigned char *out,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
87 int rows, int cols, int mod )
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
88 {
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
89 Uint32 *row1;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
90 Uint32 *row2;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
91
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
92 unsigned char* y = lum +cols*rows; // Pointer to the end
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
93 int x=0;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
94 row1 = (Uint32 *)out; // 32 bit target
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
95 row2 = (Uint32 *)out+cols+mod; // start of second row
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
96 mod = (mod+cols+mod)*4; // increment for row1 in bytes
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
97
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
98 __asm__ __volatile__ (
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
99 /* We don't really care about PIC - the code should be rewritten to use
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
100 relative addressing for the static tables, so right now we take the
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
101 COW hit on the pages this code resides. Big deal.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
102 This spill is just to reduce register pressure in the PIC case. */
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
103 "pushl %%ebx\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
104 "movl %0, %%ebx\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
105
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
106 ".align 8\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
107 "1:\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
108
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
109 // create Cr (result in mm1)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
110 "movd (%%ebx), %%mm1\n" // 0 0 0 0 v3 v2 v1 v0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
111 "pxor %%mm7,%%mm7\n" // 00 00 00 00 00 00 00 00
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
112 "movd (%2), %%mm2\n" // 0 0 0 0 l3 l2 l1 l0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
113 "punpcklbw %%mm7,%%mm1\n" // 0 v3 0 v2 00 v1 00 v0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
114 "punpckldq %%mm1,%%mm1\n" // 00 v1 00 v0 00 v1 00 v0
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
115 "psubw _MMX_0080w,%%mm1\n" // mm1-128:r1 r1 r0 r0 r1 r1 r0 r0
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
116
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
117 // create Cr_g (result in mm0)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
118 "movq %%mm1,%%mm0\n" // r1 r1 r0 r0 r1 r1 r0 r0
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
119 "pmullw _MMX_VgrnRGB,%%mm0\n"// red*-46dec=-0.7136*64
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
120 "pmullw _MMX_VredRGB,%%mm1\n"// red*89dec=1.4013*64
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
121 "psraw $6, %%mm0\n" // red=red/64
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
122 "psraw $6, %%mm1\n" // red=red/64
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
123
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
124 // create L1 L2 (result in mm2,mm4)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
125 // L2=lum+cols
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
126 "movq (%2,%4),%%mm3\n" // 0 0 0 0 L3 L2 L1 L0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
127 "punpckldq %%mm3,%%mm2\n" // L3 L2 L1 L0 l3 l2 l1 l0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
128 "movq %%mm2,%%mm4\n" // L3 L2 L1 L0 l3 l2 l1 l0
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
129 "pand _MMX_FF00w,%%mm2\n" // L3 0 L1 0 l3 0 l1 0
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
130 "pand _MMX_00FFw,%%mm4\n" // 0 L2 0 L0 0 l2 0 l0
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
131 "psrlw $8,%%mm2\n" // 0 L3 0 L1 0 l3 0 l1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
132
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
133 // create R (result in mm6)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
134 "movq %%mm2,%%mm5\n" // 0 L3 0 L1 0 l3 0 l1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
135 "movq %%mm4,%%mm6\n" // 0 L2 0 L0 0 l2 0 l0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
136 "paddsw %%mm1, %%mm5\n" // lum1+red:x R3 x R1 x r3 x r1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
137 "paddsw %%mm1, %%mm6\n" // lum1+red:x R2 x R0 x r2 x r0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
138 "packuswb %%mm5,%%mm5\n" // R3 R1 r3 r1 R3 R1 r3 r1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
139 "packuswb %%mm6,%%mm6\n" // R2 R0 r2 r0 R2 R0 r2 r0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
140 "pxor %%mm7,%%mm7\n" // 00 00 00 00 00 00 00 00
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
141 "punpcklbw %%mm5,%%mm6\n" // R3 R2 R1 R0 r3 r2 r1 r0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
142
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
143 // create Cb (result in mm1)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
144 "movd (%1), %%mm1\n" // 0 0 0 0 u3 u2 u1 u0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
145 "punpcklbw %%mm7,%%mm1\n" // 0 u3 0 u2 00 u1 00 u0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
146 "punpckldq %%mm1,%%mm1\n" // 00 u1 00 u0 00 u1 00 u0
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
147 "psubw _MMX_0080w,%%mm1\n" // mm1-128:u1 u1 u0 u0 u1 u1 u0 u0
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
148 // create Cb_g (result in mm5)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
149 "movq %%mm1,%%mm5\n" // u1 u1 u0 u0 u1 u1 u0 u0
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
150 "pmullw _MMX_UgrnRGB,%%mm5\n" // blue*-22dec=-0.34375*64
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
151 "pmullw _MMX_UbluRGB,%%mm1\n" // blue*114dec=1.78125*64
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
152 "psraw $6, %%mm5\n" // blue=blue/64
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
153 "psraw $6, %%mm1\n" // blue=blue/64
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
154
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
155 // create G (result in mm7)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
156 "movq %%mm2,%%mm3\n" // 0 L3 0 L1 0 l3 0 l1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
157 "movq %%mm4,%%mm7\n" // 0 L2 0 L0 0 l2 0 l0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
158 "paddsw %%mm5, %%mm3\n" // lum1+Cb_g:x G3t x G1t x g3t x g1t
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
159 "paddsw %%mm5, %%mm7\n" // lum1+Cb_g:x G2t x G0t x g2t x g0t
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
160 "paddsw %%mm0, %%mm3\n" // lum1+Cr_g:x G3 x G1 x g3 x g1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
161 "paddsw %%mm0, %%mm7\n" // lum1+Cr_g:x G2 x G0 x g2 x g0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
162 "packuswb %%mm3,%%mm3\n" // G3 G1 g3 g1 G3 G1 g3 g1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
163 "packuswb %%mm7,%%mm7\n" // G2 G0 g2 g0 G2 G0 g2 g0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
164 "punpcklbw %%mm3,%%mm7\n" // G3 G2 G1 G0 g3 g2 g1 g0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
165
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
166 // create B (result in mm5)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
167 "movq %%mm2,%%mm3\n" // 0 L3 0 L1 0 l3 0 l1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
168 "movq %%mm4,%%mm5\n" // 0 L2 0 L0 0 l2 0 l0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
169 "paddsw %%mm1, %%mm3\n" // lum1+blue:x B3 x B1 x b3 x b1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
170 "paddsw %%mm1, %%mm5\n" // lum1+blue:x B2 x B0 x b2 x b0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
171 "packuswb %%mm3,%%mm3\n" // B3 B1 b3 b1 B3 B1 b3 b1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
172 "packuswb %%mm5,%%mm5\n" // B2 B0 b2 b0 B2 B0 b2 b0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
173 "punpcklbw %%mm3,%%mm5\n" // B3 B2 B1 B0 b3 b2 b1 b0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
174
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
175 // fill destination row1 (needed are mm6=Rr,mm7=Gg,mm5=Bb)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
176
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
177 "pxor %%mm2,%%mm2\n" // 0 0 0 0 0 0 0 0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
178 "pxor %%mm4,%%mm4\n" // 0 0 0 0 0 0 0 0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
179 "movq %%mm6,%%mm1\n" // R3 R2 R1 R0 r3 r2 r1 r0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
180 "movq %%mm5,%%mm3\n" // B3 B2 B1 B0 b3 b2 b1 b0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
181 // process lower lum
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
182 "punpcklbw %%mm4,%%mm1\n" // 0 r3 0 r2 0 r1 0 r0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
183 "punpcklbw %%mm4,%%mm3\n" // 0 b3 0 b2 0 b1 0 b0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
184 "movq %%mm1,%%mm2\n" // 0 r3 0 r2 0 r1 0 r0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
185 "movq %%mm3,%%mm0\n" // 0 b3 0 b2 0 b1 0 b0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
186 "punpcklwd %%mm1,%%mm3\n" // 0 r1 0 b1 0 r0 0 b0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
187 "punpckhwd %%mm2,%%mm0\n" // 0 r3 0 b3 0 r2 0 b2
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
188
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
189 "pxor %%mm2,%%mm2\n" // 0 0 0 0 0 0 0 0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
190 "movq %%mm7,%%mm1\n" // G3 G2 G1 G0 g3 g2 g1 g0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
191 "punpcklbw %%mm1,%%mm2\n" // g3 0 g2 0 g1 0 g0 0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
192 "punpcklwd %%mm4,%%mm2\n" // 0 0 g1 0 0 0 g0 0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
193 "por %%mm3, %%mm2\n" // 0 r1 g1 b1 0 r0 g0 b0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
194 "movq %%mm2,(%3)\n" // wrote out ! row1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
195
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
196 "pxor %%mm2,%%mm2\n" // 0 0 0 0 0 0 0 0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
197 "punpcklbw %%mm1,%%mm4\n" // g3 0 g2 0 g1 0 g0 0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
198 "punpckhwd %%mm2,%%mm4\n" // 0 0 g3 0 0 0 g2 0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
199 "por %%mm0, %%mm4\n" // 0 r3 g3 b3 0 r2 g2 b2
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
200 "movq %%mm4,8(%3)\n" // wrote out ! row1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
201
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
202 // fill destination row2 (needed are mm6=Rr,mm7=Gg,mm5=Bb)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
203 // this can be done "destructive"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
204 "pxor %%mm2,%%mm2\n" // 0 0 0 0 0 0 0 0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
205 "punpckhbw %%mm2,%%mm6\n" // 0 R3 0 R2 0 R1 0 R0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
206 "punpckhbw %%mm1,%%mm5\n" // G3 B3 G2 B2 G1 B1 G0 B0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
207 "movq %%mm5,%%mm1\n" // G3 B3 G2 B2 G1 B1 G0 B0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
208 "punpcklwd %%mm6,%%mm1\n" // 0 R1 G1 B1 0 R0 G0 B0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
209 "movq %%mm1,(%5)\n" // wrote out ! row2
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
210 "punpckhwd %%mm6,%%mm5\n" // 0 R3 G3 B3 0 R2 G2 B2
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
211 "movq %%mm5,8(%5)\n" // wrote out ! row2
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
212
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
213 "addl $4,%2\n" // lum+4
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
214 "leal 16(%3),%3\n" // row1+16
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
215 "leal 16(%5),%5\n" // row2+16
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
216 "addl $2, %%ebx\n" // cr+2
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
217 "addl $2, %1\n" // cb+2
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
218
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
219 "addl $4,%6\n" // x+4
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
220 "cmpl %4,%6\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
221
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
222 "jl 1b\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
223 "addl %4, %2\n" // lum += cols
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
224 "addl %8, %3\n" // row1+= mod
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
225 "addl %8, %5\n" // row2+= mod
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
226 "movl $0, %6\n" // x=0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
227 "cmpl %7, %2\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
228 "jl 1b\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
229 "emms\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
230 "popl %%ebx\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
231 :
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
232 : "m" (cr), "r"(cb),"r"(lum),
1148
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
233 "r"(row1),"r"(cols),"r"(row2),"m"(x),"m"(y),"m"(mod));
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
234 }
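The unpack sequence at the end of the 32-bit routine above interleaves separate blue, green and red bytes into pixels of the form "0 R G B". A minimal scalar sketch of that layout follows, assuming the register comments list bytes from most to least significant, so that each pixel is 0x00RRGGBB. The helper names pack_xrgb8888 and interleave_rgb4 are hypothetical and do not appear in SDL.

```c
#include <stdint.h>

/* Hypothetical helper: pack one pixel as 0x00RRGGBB, the layout the
 * "0 R G B" register comments suggest (an assumption, see above). */
static uint32_t pack_xrgb8888(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* Hypothetical helper: scalar stand-in for the punpck* interleave.
 * Takes four samples per channel and emits four packed pixels. */
static void interleave_rgb4(const uint8_t r[4], const uint8_t g[4],
                            const uint8_t b[4], uint32_t out[4])
{
    int i;
    for (i = 0; i < 4; ++i) {
        out[i] = pack_xrgb8888(r[i], g[i], b[i]);
    }
}
```

The MMX code produces two such pixels per 64-bit store, which is why each row is written with two movq instructions per four pixels.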
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
235
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
236 void Color565DitherYV12MMX1X( int *colortab, Uint32 *rgb_2_pix,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
237 unsigned char *lum, unsigned char *cr,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
238 unsigned char *cb, unsigned char *out,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
239 int rows, int cols, int mod )
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
240 {
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
241 Uint16 *row1;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
242 Uint16 *row2;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
243
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
244 unsigned char* y = lum +cols*rows; /* Pointer to the end of the luminance plane */
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
245 int x=0;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
246 row1 = (Uint16 *)out; /* 16 bit target */
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
247 row2 = (Uint16 *)out+cols+mod; /* start of second row */
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
248 mod = (mod+cols+mod)*2; /* increment for row1 in bytes */
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
249
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
250
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
251 __asm__ __volatile__(
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
252 "pushl %%ebx\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
253 "movl %0, %%ebx\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
254
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
255 ".align 8\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
256 "1:\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
257 "movd (%1), %%mm0\n" // 4 Cb 0 0 0 0 u3 u2 u1 u0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
258 "pxor %%mm7, %%mm7\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
259 "movd (%%ebx), %%mm1\n" // 4 Cr 0 0 0 0 v3 v2 v1 v0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
260 "punpcklbw %%mm7, %%mm0\n" // 4 W cb 0 u3 0 u2 0 u1 0 u0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
261 "punpcklbw %%mm7, %%mm1\n" // 4 W cr 0 v3 0 v2 0 v1 0 v0
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
262 "psubw _MMX_0080w, %%mm0\n"
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
263 "psubw _MMX_0080w, %%mm1\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
264 "movq %%mm0, %%mm2\n" // Cb 0 u3 0 u2 0 u1 0 u0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
265 "movq %%mm1, %%mm3\n" // Cr
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
266 "pmullw _MMX_Ugrn565, %%mm2\n" // Cb2green 0 R3 0 R2 0 R1 0 R0
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
267 "movq (%2), %%mm6\n" // L1 L7 L6 L5 L4 L3 L2 L1 L0
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
268 "pmullw _MMX_Ublu5x5, %%mm0\n" // Cb2blue
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
269 "pand _MMX_00FFw, %%mm6\n" // L1 00 L6 00 L4 00 L2 00 L0
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
270 "pmullw _MMX_Vgrn565, %%mm3\n" // Cr2green
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
271 "movq (%2), %%mm7\n" // L2
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
272 "pmullw _MMX_Vred5x5, %%mm1\n" // Cr2red
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
273 "psrlw $8, %%mm7\n" // L2 00 L7 00 L5 00 L3 00 L1
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
274 "pmullw _MMX_Ycoeff, %%mm6\n" // lum1
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
275 "paddw %%mm3, %%mm2\n" // Cb2green + Cr2green == green
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
276 "pmullw _MMX_Ycoeff, %%mm7\n" // lum2
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
277
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
278 "movq %%mm6, %%mm4\n" // lum1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
279 "paddw %%mm0, %%mm6\n" // lum1 +blue 00 B6 00 B4 00 B2 00 B0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
280 "movq %%mm4, %%mm5\n" // lum1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
281 "paddw %%mm1, %%mm4\n" // lum1 +red 00 R6 00 R4 00 R2 00 R0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
282 "paddw %%mm2, %%mm5\n" // lum1 +green 00 G6 00 G4 00 G2 00 G0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
283 "psraw $6, %%mm4\n" // R1 0 .. 64
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
284 "movq %%mm7, %%mm3\n" // lum2 00 L7 00 L5 00 L3 00 L1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
285 "psraw $6, %%mm5\n" // G1 - .. +
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
286 "paddw %%mm0, %%mm7\n" // Lum2 +blue 00 B7 00 B5 00 B3 00 B1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
287 "psraw $6, %%mm6\n" // B1 0 .. 64
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
288 "packuswb %%mm4, %%mm4\n" // R1 R1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
289 "packuswb %%mm5, %%mm5\n" // G1 G1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
290 "packuswb %%mm6, %%mm6\n" // B1 B1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
291 "punpcklbw %%mm4, %%mm4\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
292 "punpcklbw %%mm5, %%mm5\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
293
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
294 "pand _MMX_red565, %%mm4\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
295 "psllw $3, %%mm5\n" // GREEN 1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
296 "punpcklbw %%mm6, %%mm6\n"
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
297 "pand _MMX_grn565, %%mm5\n"
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
298 "pand _MMX_red565, %%mm6\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
299 "por %%mm5, %%mm4\n" //
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
300 "psrlw $11, %%mm6\n" // BLUE 1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
301 "movq %%mm3, %%mm5\n" // lum2
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
302 "paddw %%mm1, %%mm3\n" // lum2 +red 00 R7 00 R5 00 R3 00 R1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
303 "paddw %%mm2, %%mm5\n" // lum2 +green 00 G7 00 G5 00 G3 00 G1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
304 "psraw $6, %%mm3\n" // R2
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
305 "por %%mm6, %%mm4\n" // MM4
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
306 "psraw $6, %%mm5\n" // G2
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
307 "movq (%2, %4), %%mm6\n" // L3 load lum2
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
308 "psraw $6, %%mm7\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
309 "packuswb %%mm3, %%mm3\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
310 "packuswb %%mm5, %%mm5\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
311 "packuswb %%mm7, %%mm7\n"
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
312 "pand _MMX_00FFw, %%mm6\n" // L3
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
313 "punpcklbw %%mm3, %%mm3\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
314 "punpcklbw %%mm5, %%mm5\n"
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
315 "pmullw _MMX_Ycoeff, %%mm6\n" // lum3
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
316 "punpcklbw %%mm7, %%mm7\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
317 "psllw $3, %%mm5\n" // GREEN 2
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
318 "pand _MMX_red565, %%mm7\n"
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
319 "pand _MMX_red565, %%mm3\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
320 "psrlw $11, %%mm7\n" // BLUE 2
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
321 "pand _MMX_grn565, %%mm5\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
322 "por %%mm7, %%mm3\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
323 "movq (%2,%4), %%mm7\n" // L4 load lum2
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
324 "por %%mm5, %%mm3\n" //
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
325 "psrlw $8, %%mm7\n" // L4
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
326 "movq %%mm4, %%mm5\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
327 "punpcklwd %%mm3, %%mm4\n"
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
328 "pmullw _MMX_Ycoeff, %%mm7\n" // lum4
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
329 "punpckhwd %%mm3, %%mm5\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
330
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
331 "movq %%mm4, (%3)\n" // write row1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
332 "movq %%mm5, 8(%3)\n" // write row1
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
333
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
334 "movq %%mm6, %%mm4\n" // Lum3
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
335 "paddw %%mm0, %%mm6\n" // Lum3 +blue
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
336
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
337 "movq %%mm4, %%mm5\n" // Lum3
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
338 "paddw %%mm1, %%mm4\n" // Lum3 +red
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
339 "paddw %%mm2, %%mm5\n" // Lum3 +green
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
340 "psraw $6, %%mm4\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
341 "movq %%mm7, %%mm3\n" // Lum4
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
342 "psraw $6, %%mm5\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
343 "paddw %%mm0, %%mm7\n" // Lum4 +blue
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
344 "psraw $6, %%mm6\n" // Lum3 +blue
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
345 "movq %%mm3, %%mm0\n" // Lum4
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
346 "packuswb %%mm4, %%mm4\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
347 "paddw %%mm1, %%mm3\n" // Lum4 +red
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
348 "packuswb %%mm5, %%mm5\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
349 "paddw %%mm2, %%mm0\n" // Lum4 +green
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
350 "packuswb %%mm6, %%mm6\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
351 "punpcklbw %%mm4, %%mm4\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
352 "punpcklbw %%mm5, %%mm5\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
353 "punpcklbw %%mm6, %%mm6\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
354 "psllw $3, %%mm5\n" // GREEN 3
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
355 "pand _MMX_red565, %%mm4\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
356 "psraw $6, %%mm3\n" // psr 6
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
357 "psraw $6, %%mm0\n"
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
358 "pand _MMX_red565, %%mm6\n" // BLUE
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
359 "pand _MMX_grn565, %%mm5\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
360 "psrlw $11, %%mm6\n" // BLUE 3
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
361 "por %%mm5, %%mm4\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
362 "psraw $6, %%mm7\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
363 "por %%mm6, %%mm4\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
364 "packuswb %%mm3, %%mm3\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
365 "packuswb %%mm0, %%mm0\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
366 "packuswb %%mm7, %%mm7\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
367 "punpcklbw %%mm3, %%mm3\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
368 "punpcklbw %%mm0, %%mm0\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
369 "punpcklbw %%mm7, %%mm7\n"
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
370 "pand _MMX_red565, %%mm3\n"
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
371 "pand _MMX_red565, %%mm7\n" // BLUE
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
372 "psllw $3, %%mm0\n" // GREEN 4
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
373 "psrlw $11, %%mm7\n"
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
374 "pand _MMX_grn565, %%mm0\n"
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
375 "por %%mm7, %%mm3\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
376 "por %%mm0, %%mm3\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
377
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
378 "movq %%mm4, %%mm5\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
379
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
380 "punpcklwd %%mm3, %%mm4\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
381 "punpckhwd %%mm3, %%mm5\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
382
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
383 "movq %%mm4, (%5)\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
384 "movq %%mm5, 8(%5)\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
385
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
386 "addl $8, %6\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
387 "addl $8, %2\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
388 "addl $4, %%ebx\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
389 "addl $4, %1\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
390 "cmpl %4, %6\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
391 "leal 16(%3), %3\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
392 "leal 16(%5),%5\n" // row2+16
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
393
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
394
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
395 "jl 1b\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
396 "addl %4, %2\n" // lum += cols
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
397 "addl %8, %3\n" // row1+= mod
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
398 "addl %8, %5\n" // row2+= mod
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
399 "movl $0, %6\n" // x=0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
400 "cmpl %7, %2\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
401 "jl 1b\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
402 "emms\n"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
403 "popl %%ebx\n"
1038
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
404 :
29d7db09776e Ugly hack to make this work with gcc 2.x and 3.x
Sam Lantinga <slouken@libsdl.org>
parents: 949
diff changeset
405 :"m" (cr), "r"(cb),"r"(lum),
1148
63fb2da89a4b Patched inline assembly to compile on gcc 4.0.1. Details are here:
Ryan C. Gordon <icculus@icculus.org>
parents: 1038
diff changeset
406 "r"(row1),"r"(cols),"r"(row2),"m"(x),"m"(y),"m"(mod));
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
407 }
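The 565 routine above builds each 16-bit pixel by duplicating a saturated channel byte into both halves of a word (packuswb followed by punpcklbw), then masking and shifting. The sketch below reproduces those steps in scalar C and compares them with the conventional shift-based packing. It assumes the channel values are 8-bit after saturation and that _MMX_red565 and _MMX_grn565 hold 0xF800 and 0x07E0 in every word; their definitions are not shown in this section. All function names are hypothetical.

```c
#include <stdint.h>

/* Assumed per-word mask values for _MMX_red565 and _MMX_grn565. */
#define RED565_MASK 0xF800u
#define GRN565_MASK 0x07E0u

/* Scalar stand-in for "punpcklbw %mmN, %mmN": byte v becomes word vv. */
static uint16_t dup_byte(uint8_t v)
{
    return (uint16_t)(((unsigned)v << 8) | v);
}

/* Follows the mask-and-shift order used in the assembly listing. */
static uint16_t pack_rgb565_mmx_style(uint8_t r, uint8_t g, uint8_t b)
{
    uint16_t rw = (uint16_t)(dup_byte(r) & RED565_MASK);           /* pand red565        */
    uint16_t gw = (uint16_t)((uint16_t)((unsigned)dup_byte(g) << 3)
                             & GRN565_MASK);                       /* psllw 3, pand grn  */
    uint16_t bw = (uint16_t)((dup_byte(b) & RED565_MASK) >> 11);   /* pand red, psrlw 11 */
    return (uint16_t)(rw | gw | bw);                               /* por, por           */
}

/* Conventional packing, for comparison. */
static uint16_t pack_rgb565_plain(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

Reusing the red mask for blue, followed by a right shift of 11, saves a third mask constant; under the stated assumptions both functions return the same value for every input.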
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
408
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
409 #endif /* GCC i386 inline assembly */
1413
Sam Lantinga <slouken@libsdl.org>
parents: 1407
diff changeset
410 #endif /* 0 */
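Both routines in this listing process two output rows per pass and reuse the mod argument as a byte increment, for example mod = (mod+cols+mod)*2 in the 16-bit version. The sketch below checks that arithmetic, assuming mod is the per-row padding in pixels, so the row stride is cols + mod pixels. The helper names are hypothetical.

```c
#include <stddef.h>

/* Bytes to add to a row pointer that has already advanced by `cols`
 * pixels so that it lands on the row two lines further down. */
static size_t row_pair_skip_bytes(int cols, int mod, size_t bytes_per_pixel)
{
    return (size_t)(cols + 2 * mod) * bytes_per_pixel;
}

/* Byte offset of the next row this pointer handles, given the byte
 * offset at which the current row started. */
static size_t next_pair_offset_bytes(size_t row_start_bytes, int cols,
                                     int mod, size_t bytes_per_pixel)
{
    size_t after_row = row_start_bytes + (size_t)cols * bytes_per_pixel;
    return after_row + row_pair_skip_bytes(cols, mod, bytes_per_pixel);
}
```

With cols = 8, mod = 2 and 2 bytes per pixel, the stride is 20 bytes; a pointer that starts at offset 0 ends up at offset 40, and one that starts at offset 20 ends up at offset 60, so row1 and row2 each skip over the row the other one wrote.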