/[gxemul]/trunk/src/bintrans_i386.c
This is repository of my old source code which isn't updated any more. Go to git.rot13.org for current projects!
ViewVC logotype

Contents of /trunk/src/bintrans_i386.c

Parent Directory Parent Directory | Revision Log Revision Log


Revision 12 - (show annotations)
Mon Oct 8 16:18:38 2007 UTC (16 years, 5 months ago) by dpavlin
File MIME type: text/plain
File size: 85064 byte(s)
++ trunk/HISTORY	(local)
$Id: HISTORY,v 1.905 2005/08/16 09:16:24 debug Exp $
20050628	Continuing the work on the ARM translation engine. end_of_page
		works. Experimenting with load/store translation caches
		(virtual -> physical -> host).
20050629	More ARM stuff (memory access translation cache, mostly). This
		might break a lot of stuff elsewhere, probably some MIPS-
		related translation things.
20050630	Many load/stores are now automatically generated and included
		into cpu_arm_instr.c; 1024 functions in total (!).
		Fixes based on feedback from Alec Voropay: only print 8 hex
		digits instead of 16 in some cases when emulating 32-bit
		machines; similar 8 vs 16 digit fix for breakpoint addresses;
		4Kc has 16 TLB entries, not 48; the MIPS config select1
		register is now printed with "reg ,0".
		Also changing many other occurances of 16 vs 8 digit output.
		Adding cache associativity fields to mips_cpu_types.h; updating
		some other cache fields; making the output of
		mips_cpu_dumpinfo() look nicer.
		Generalizing the bintrans stuff for device accesses to also
		work with the new translation system. (This might also break
		some MIPS things.)
		Adding multi-load/store instructions to the ARM disassembler
		and the translator, and some optimizations of various kinds.
20050701	Adding a simple dev_disk (it can read/write sectors from
		disk images).
20050712	Adding dev_ether (a simple ethernet send/receive device).
		Debugger command "ninstrs" for toggling show_nr_of_instructions
		during runtime.
		Removing the framebuffer logo.
20050713	Continuing on dev_ether.
		Adding a dummy cpu_alpha (again).
20050714	More work on cpu_alpha.
20050715	More work on cpu_alpha. Many instructions work, enough to run
		a simple framebuffer fill test (similar to the ARM test).
20050716	More Alpha stuff.
20050717	Minor updates (Alpha stuff).
20050718	Minor updates (Alpha stuff).
20050719	Generalizing some Alpha instructions.
20050720	More Alpha-related updates.
20050721	Continuing on cpu_alpha. Importing rpb.h from NetBSD/alpha.
20050722	Alpha-related updates: userland stuff (Hello World using
		write() compiled statically for FreeBSD/Alpha runs fine), and
		more instructions are now implemented.
20050723	Fixing ldq_u and stq_u.
		Adding more instructions (conditional moves, masks, extracts,
		shifts).
20050724	More FreeBSD/Alpha userland stuff, and adding some more
		instructions (inserts).
20050725	Continuing on the Alpha stuff. (Adding dummy ldt/stt.)
		Adding a -A command line option to turn off alignment checks
		in some cases (for translated code).
		Trying to remove the old bintrans code which updated the pc
		and nr_of_executed_instructions for every instruction.
20050726	Making another attempt att removing the pc/nr of instructions
		code. This time it worked, huge performance increase for
		artificial test code, but performance loss for real-world
		code :-( so I'm scrapping that code for now.
		Tiny performance increase on Alpha (by using ret instead of
		jmp, to play nice with the Alpha's branch prediction) for the
		old MIPS bintrans backend.
20050727	Various minor fixes and cleanups.
20050728	Switching from a 2-level virtual to host/physical translation
		system for ARM emulation, to a 1-level translation.
		Trying to switch from 2-level to 1-level for the MIPS bintrans
		system as well (Alpha only, so far), but there is at least one
		problem: caches and/or how they work with device mappings.
20050730	Doing the 2-level to 1-level conversion for the i386 backend.
		The cache/device bug is still there for R2K/3K :(
		Various other minor updates (Malta etc).
		The mc146818 clock now updates the UIP bit in a way which works
		better with Linux for at least sgimips and Malta emulation.
		Beginning the work on refactoring the dyntrans system.
20050731	Continuing the dyntrans refactoring.
		Fixing a small but serious host alignment bug in memory_rw.
		Adding support for big-endian load/stores to the i386 bintrans
		backend.
		Another minor i386 bintrans backend update: stores from the
		zero register are now one (or two) loads shorter.
		The slt and sltu instructions were incorrectly implemented for
		the i386 backend; only using them for 32-bit mode for now.
20050801	Continuing the dyntrans refactoring.
		Cleanup of the ns16550 serial controller (removing unnecessary
		code).
		Bugfix (memory corruption bug) in dev_gt, and a patch/hack from
		Alec Voropay for Linux/Malta.
20050802	More cleanup/refactoring of the dyntrans subsystem: adding
		phys_page pointers to the lookup tables, for quick jumps
		between translated pages.
		Better fix for the ns16550 device (but still no real FIFO
		functionality).
		Converting cpu_ppc to the new dyntrans system. This means that
		I will have to start from scratch with implementing each
		instruction, and figure out how to implement dual 64/32-bit
		modes etc.
		Removing the URISC CPU family, because it was useless.
20050803	When selecting a machine type, the main type can now be omitted
		if the subtype name is unique. (I.e. -E can be omitted.)
		Fixing a dyntrans/device update bug. (Writes to offset 0 of
		a device could sometimes go unnoticed.)
		Adding an experimental "instruction combination" hack for
		ARM for memset-like byte fill loops.
20050804	Minor progress on cpu_alpha and related things.
		Finally fixing the MIPS dmult/dmultu bugs.
		Fixing some minor TODOs.
20050805	Generalizing the 8259 PIC. It now also works with Cobalt
		and evbmips emulation, in addition to the x86 hack.
		Finally converting the ns16550 device to use devinit.
		Continuing the work on the dyntrans system. Thinking about
		how to add breakpoints.
20050806	More dyntrans updates. Breakpoints seem to work now.
20050807	Minor updates: cpu_alpha and related things; removing
		dev_malta (as it isn't used any more).
		Dyntrans: working on general "show trace tree" support.
		The trace tree stuff now works with both the old MIPS code and
		with newer dyntrans modes. :)
		Continuing on Alpha-related stuff (trying to get *BSD to boot
		a bit further, adding more instructions, etc).
20050808	Adding a dummy IA64 cpu family, and continuing the refactoring
		of the dyntrans system.
		Removing the regression test stuff, because it was more or
		less useless.
		Adding loadlinked/storeconditional type instructions to the
		Alpha emulation. (Needed for Linux/alpha. Not very well tested
		yet.)
20050809	The function call trace tree now prints a per-function nr of
		arguments. (Semi-meaningless, since that data isn't read yet
		from the ELFs; some hardcoded symbols such as memcpy() and
		strlen() work fine, though.)
		More dyntrans refactoring; taking out more of the things that
		are common to all cpu families.
20050810	Working on adding support for "dual mode" for PPC dyntrans
		(i.e. both 64-bit and 32-bit modes).
		(Re)adding some simple PPC instructions.
20050811	Adding a dummy M68K cpu family. The dyntrans system isn't ready
		for variable-length ISAs yet, so it's completely bogus so far.
		Re-adding more PPC instructions.
		Adding a hack to src/file.c which allows OpenBSD/mac68k a.out
		kernels to be loaded.
		Beginning to add PPC loads/stores. So far they only work in
		32-bit mode.
20050812	The configure file option "add_remote" now accepts symbolic
		host names, in addition to numeric IPv4 addresses.
		Re-adding more PPC instructions.
20050814	Continuing to port back more PPC instructions.
		Found and fixed the cache/device write-update bug for 32-bit
		MIPS bintrans. :-)
		Triggered a really weird and annoying bug in Compaq's C
		compiler; ccc sometimes outputs code which loads from an
		address _before_ checking whether the pointer was NULL or not.
		(I'm not sure how to handle this problem.)
20050815	Removing all of the old x86 instruction execution code; adding
		a new (dummy) dyntrans module for x86.
		Taking the first steps to extend the dyntrans system to support
		variable-length instructions.
		Slowly preparing for the next release.
20050816	Adding a dummy SPARC cpu module.
		Minor updates (documentation etc) for the release.

==============  RELEASE 0.3.5  ==============


1 /*
2 * Copyright (C) 2004-2005 Anders Gavare. All rights reserved.
3 *
4 * Redistribution and use in source and binary forms, with or without
5 * modification, are permitted provided that the following conditions are met:
6 *
7 * 1. Redistributions of source code must retain the above copyright
8 * notice, this list of conditions and the following disclaimer.
9 * 2. Redistributions in binary form must reproduce the above copyright
10 * notice, this list of conditions and the following disclaimer in the
11 * documentation and/or other materials provided with the distribution.
12 * 3. The name of the author may not be used to endorse or promote products
13 * derived from this software without specific prior written permission.
14 *
15 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
16 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
17 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
18 * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
19 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
20 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
21 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
22 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
23 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
24 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
25 * SUCH DAMAGE.
26 *
27 *
28 * $Id: bintrans_i386.c,v 1.83 2005/07/31 08:47:57 debug Exp $
29 *
30 * i386 specific code for dynamic binary translation.
31 * See bintrans.c for more information. Included from bintrans.c.
32 *
33 * Translated code uses the following conventions at all time:
34 *
35 * esi points to the cpu struct
36 * edi lowest 32 bits of cpu->pc
37 * ebp contains cpu->bintrans_instructions_executed
38 */
39
40
41 struct cpu dummy_cpu;
42 struct mips_coproc dummy_coproc;
43 struct vth32_table dummy_vth32_table;
44
45
46 /*
47 * bintrans_host_cacheinvalidate()
48 *
49 * Invalidate the host's instruction cache. On i386, this isn't necessary,
50 * so this is an empty function.
51 */
52 static void bintrans_host_cacheinvalidate(unsigned char *p, size_t len)
53 {
54 /* Do nothing. */
55 }
56
57
58 /* offsetof (in stdarg.h) could possibly be used, but I'm not sure
59 if it will take care of the compiler problems... */
60
61 #define ofs_i (((size_t)&dummy_cpu.cd.mips.bintrans_instructions_executed) - ((size_t)&dummy_cpu))
62 #define ofs_pc (((size_t)&dummy_cpu.pc) - ((size_t)&dummy_cpu))
63 #define ofs_pc_last (((size_t)&dummy_cpu.cd.mips.pc_last) - ((size_t)&dummy_cpu))
64 #define ofs_tabl0 (((size_t)&dummy_cpu.cd.mips.vaddr_to_hostaddr_table0) - ((size_t)&dummy_cpu))
65 #define ofs_chunks ((size_t)&dummy_vth32_table.bintrans_chunks[0] - (size_t)&dummy_vth32_table)
66 #define ofs_chunkbase ((size_t)&dummy_cpu.cd.mips.chunk_base_address - (size_t)&dummy_cpu)
67 #define ofs_h_l (((size_t)&dummy_cpu.cd.mips.host_load) - ((size_t)&dummy_cpu))
68 #define ofs_h_s (((size_t)&dummy_cpu.cd.mips.host_store) - ((size_t)&dummy_cpu))
69
70
71 static void (*bintrans_runchunk)(struct cpu *, unsigned char *);
72 static void (*bintrans_jump_to_32bit_pc)(struct cpu *);
73 static void (*bintrans_load_32bit)(struct cpu *);
74 static void (*bintrans_store_32bit)(struct cpu *);
75
76
77 /*
78 * bintrans_write_quickjump():
79 */
80 static void bintrans_write_quickjump(struct memory *mem,
81 unsigned char *quickjump_code, uint32_t chunkoffset)
82 {
83 uint32_t i386_addr;
84 unsigned char *a = quickjump_code;
85
86 i386_addr = chunkoffset + (size_t)mem->translation_code_chunk_space;
87 i386_addr = i386_addr - ((size_t)a + 5);
88
89 /* printf("chunkoffset=%i, %08x %08x %i\n",
90 chunkoffset, i386_addr, a, ofs); */
91
92 *a++ = 0xe9;
93 *a++ = i386_addr;
94 *a++ = i386_addr >> 8;
95 *a++ = i386_addr >> 16;
96 *a++ = i386_addr >> 24;
97 }
98
99
100 /*
101 * bintrans_write_chunkreturn():
102 */
103 static void bintrans_write_chunkreturn(unsigned char **addrp)
104 {
105 unsigned char *a = *addrp;
106 *a++ = 0xc3; /* ret */
107 *addrp = a;
108 }
109
110
111 /*
112 * bintrans_write_chunkreturn_fail():
113 */
114 static void bintrans_write_chunkreturn_fail(unsigned char **addrp)
115 {
116 unsigned char *a = *addrp;
117
118 /* 81 cd 00 00 00 01 orl $0x1000000,%ebp */
119 *a++ = 0x81; *a++ = 0xcd;
120 *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x01; /* TODO: not hardcoded */
121
122 *a++ = 0xc3; /* ret */
123 *addrp = a;
124 }
125
126
127 /*
128 * bintrans_write_pc_inc():
129 */
130 static void bintrans_write_pc_inc(unsigned char **addrp)
131 {
132 unsigned char *a = *addrp;
133
134 /* 83 c7 04 add $0x4,%edi */
135 *a++ = 0x83; *a++ = 0xc7; *a++ = 4;
136
137 #if 0
138 if (!bintrans_32bit_only) {
139 int ofs;
140 /* 83 96 zz zz zz zz 00 adcl $0x0,zz(%esi) */
141 ofs = ((size_t)&dummy_cpu.pc) - (size_t)&dummy_cpu;
142 ofs += 4;
143 *a++ = 0x83; *a++ = 0x96;
144 *a++ = ofs & 255;
145 *a++ = (ofs >> 8) & 255;
146 *a++ = (ofs >> 16) & 255;
147 *a++ = (ofs >> 24) & 255;
148 *a++ = 0;
149 }
150 #endif
151
152 /* 45 inc %ebp */
153 *a++ = 0x45;
154
155 *addrp = a;
156 }
157
158
159 /*
160 * load_pc_into_eax_edx():
161 */
162 static void load_pc_into_eax_edx(unsigned char **addrp)
163 {
164 unsigned char *a;
165 a = *addrp;
166
167 /* 89 f8 mov %edi,%eax */
168 *a++ = 0x89; *a++ = 0xf8;
169
170 #if 0
171 if (bintrans_32bit_only) {
172 /* 99 cltd */
173 *a++ = 0x99;
174 } else
175 #endif
176 {
177 int ofs = ((size_t)&dummy_cpu.pc) - (size_t)&dummy_cpu;
178 /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
179 ofs += 4;
180 *a++ = 0x8b; *a++ = 0x96;
181 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
182 }
183
184 *addrp = a;
185 }
186
187
188 /*
189 * store_eax_edx_into_pc():
190 */
191 static void store_eax_edx_into_pc(unsigned char **addrp)
192 {
193 unsigned char *a;
194 int ofs = ((size_t)&dummy_cpu.pc) - (size_t)&dummy_cpu;
195 a = *addrp;
196
197 /* 89 c7 mov %eax,%edi */
198 *a++ = 0x89; *a++ = 0xc7;
199
200 /* 89 96 3c 30 00 00 mov %edx,0x303c(%esi) */
201 ofs += 4;
202 *a++ = 0x89; *a++ = 0x96;
203 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
204
205 *addrp = a;
206 }
207
208
209 /*
210 * load_into_eax_edx():
211 *
212 * Usage: load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]); etc.
213 */
214 static void load_into_eax_edx(unsigned char **addrp, void *p)
215 {
216 unsigned char *a;
217 int ofs = (size_t)p - (size_t)&dummy_cpu;
218 a = *addrp;
219
220 /* 8b 86 38 30 00 00 mov 0x3038(%esi),%eax */
221 *a++ = 0x8b; *a++ = 0x86;
222 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
223
224 #if 0
225 if (bintrans_32bit_only) {
226 /* 99 cltd */
227 *a++ = 0x99;
228 } else
229 #endif
230 {
231 /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
232 ofs += 4;
233 *a++ = 0x8b; *a++ = 0x96;
234 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
235 }
236
237 *addrp = a;
238 }
239
240
241 /*
242 * load_into_eax_and_sign_extend_into_edx():
243 *
244 * Usage: load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]); etc.
245 */
246 static void load_into_eax_and_sign_extend_into_edx(unsigned char **addrp, void *p)
247 {
248 unsigned char *a;
249 int ofs = (size_t)p - (size_t)&dummy_cpu;
250 a = *addrp;
251
252 /* 8b 86 38 30 00 00 mov 0x3038(%esi),%eax */
253 *a++ = 0x8b; *a++ = 0x86;
254 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
255
256 /* 99 cltd */
257 *a++ = 0x99;
258
259 *addrp = a;
260 }
261
262
263 /*
264 * load_into_eax_dont_care_about_edx():
265 *
266 * Usage: load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]); etc.
267 */
268 static void load_into_eax_dont_care_about_edx(unsigned char **addrp, void *p)
269 {
270 unsigned char *a;
271 int ofs = (size_t)p - (size_t)&dummy_cpu;
272 a = *addrp;
273
274 /* 8b 86 38 30 00 00 mov 0x3038(%esi),%eax */
275 *a++ = 0x8b; *a++ = 0x86;
276 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
277
278 *addrp = a;
279 }
280
281
282 /*
283 * store_eax_edx():
284 *
285 * Usage: store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]); etc.
286 */
287 static void store_eax_edx(unsigned char **addrp, void *p)
288 {
289 unsigned char *a;
290 int ofs = (size_t)p - (size_t)&dummy_cpu;
291 a = *addrp;
292
293 /* 89 86 38 30 00 00 mov %eax,0x3038(%esi) */
294 *a++ = 0x89; *a++ = 0x86;
295 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
296
297 /* 89 96 3c 30 00 00 mov %edx,0x303c(%esi) */
298 ofs += 4;
299 *a++ = 0x89; *a++ = 0x96;
300 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
301
302 *addrp = a;
303 }
304
305
306 /*
307 * bintrans_write_instruction__lui():
308 */
309 static int bintrans_write_instruction__lui(unsigned char **addrp, int rt, int imm)
310 {
311 unsigned char *a;
312
313 a = *addrp;
314 if (rt == 0)
315 goto rt0;
316
317 /* b8 00 00 dc fe mov $0xfedc0000,%eax */
318 *a++ = 0xb8; *a++ = 0; *a++ = 0;
319 *a++ = imm & 255; *a++ = imm >> 8;
320
321 /* 99 cltd */
322 *a++ = 0x99;
323
324 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
325 *addrp = a;
326
327 rt0:
328 bintrans_write_pc_inc(addrp);
329 return 1;
330 }
331
332
333 /*
334 * bintrans_write_instruction__jr():
335 */
336 static int bintrans_write_instruction__jr(unsigned char **addrp,
337 int rs, int rd, int special)
338 {
339 unsigned char *a;
340 int ofs;
341
342 a = *addrp;
343
344 /*
345 * Perform the jump by setting cpu->delay_slot = TO_BE_DELAYED
346 * and cpu->delay_jmpaddr = gpr[rs].
347 */
348
349 /* c7 86 38 30 00 00 01 00 00 00 movl $0x1,0x3038(%esi) */
350 ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
351 *a++ = 0xc7; *a++ = 0x86;
352 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
353 *a++ = TO_BE_DELAYED; *a++ = 0; *a++ = 0; *a++ = 0;
354
355 #if 0
356 if (bintrans_32bit_only)
357 load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
358 else
359 #endif
360 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
361
362 store_eax_edx(&a, &dummy_cpu.cd.mips.delay_jmpaddr);
363
364 if (special == SPECIAL_JALR && rd != 0) {
365 /* gpr[rd] = retaddr (pc + 8) */
366
367 #if 0
368 if (bintrans_32bit_only) {
369 load_pc_into_eax_edx(&a);
370 /* 83 c0 08 add $0x8,%eax */
371 *a++ = 0x83; *a++ = 0xc0; *a++ = 0x08;
372 } else
373 #endif
374 {
375 load_pc_into_eax_edx(&a);
376 /* 83 c0 08 add $0x8,%eax */
377 /* 83 d2 00 adc $0x0,%edx */
378 *a++ = 0x83; *a++ = 0xc0; *a++ = 0x08;
379 *a++ = 0x83; *a++ = 0xd2; *a++ = 0x00;
380 }
381
382 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
383 }
384
385 *addrp = a;
386 bintrans_write_pc_inc(addrp);
387 return 1;
388 }
389
390
391 /*
392 * bintrans_write_instruction__mfmthilo():
393 */
394 static int bintrans_write_instruction__mfmthilo(unsigned char **addrp,
395 int rd, int from_flag, int hi_flag)
396 {
397 unsigned char *a;
398
399 a = *addrp;
400
401 if (from_flag) {
402 if (rd != 0) {
403 /* mfhi or mflo */
404 if (hi_flag)
405 load_into_eax_edx(&a, &dummy_cpu.cd.mips.hi);
406 else
407 load_into_eax_edx(&a, &dummy_cpu.cd.mips.lo);
408 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
409 }
410 } else {
411 /* mthi or mtlo */
412 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
413 if (hi_flag)
414 store_eax_edx(&a, &dummy_cpu.cd.mips.hi);
415 else
416 store_eax_edx(&a, &dummy_cpu.cd.mips.lo);
417 }
418
419 *addrp = a;
420 bintrans_write_pc_inc(addrp);
421 return 1;
422 }
423
424
425 /*
426 * bintrans_write_instruction__addiu_etc():
427 */
428 static int bintrans_write_instruction__addiu_etc(
429 struct memory *mem, unsigned char **addrp,
430 int rt, int rs, int imm, int instruction_type)
431 {
432 unsigned char *a;
433 unsigned int uimm;
434
435 /* TODO: overflow detection for ADDI and DADDI */
436 switch (instruction_type) {
437 case HI6_ADDI:
438 case HI6_DADDI:
439 return 0;
440 }
441
442 a = *addrp;
443
444 if (rt == 0)
445 goto rt0;
446
447 uimm = imm & 0xffff;
448
449 if (uimm == 0 && (instruction_type == HI6_ADDIU ||
450 instruction_type == HI6_ADDI)) {
451 load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
452 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
453 goto rt0;
454 }
455
456 if (uimm == 0 && (instruction_type == HI6_DADDIU ||
457 instruction_type == HI6_DADDI || instruction_type == HI6_ORI)) {
458 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
459 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
460 goto rt0;
461 }
462
463 if (mem->bintrans_32bit_only)
464 load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
465 else
466 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
467
468 switch (instruction_type) {
469 case HI6_ADDIU:
470 case HI6_DADDIU:
471 case HI6_ADDI:
472 case HI6_DADDI:
473 if (imm & 0x8000) {
474 /* 05 39 fd ff ff add $0xfffffd39,%eax */
475 /* 83 d2 ff adc $0xffffffff,%edx */
476 *a++ = 0x05; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0xff; *a++ = 0xff;
477 if (instruction_type == HI6_DADDIU) {
478 *a++ = 0x83; *a++ = 0xd2; *a++ = 0xff;
479 }
480 } else {
481 /* 05 c7 02 00 00 add $0x2c7,%eax */
482 /* 83 d2 00 adc $0x0,%edx */
483 *a++ = 0x05; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0; *a++ = 0;
484 if (instruction_type == HI6_DADDIU) {
485 *a++ = 0x83; *a++ = 0xd2; *a++ = 0;
486 }
487 }
488 if (instruction_type == HI6_ADDIU) {
489 /* 99 cltd */
490 *a++ = 0x99;
491 }
492 break;
493 case HI6_ANDI:
494 /* 25 34 12 00 00 and $0x1234,%eax */
495 /* 31 d2 xor %edx,%edx */
496 *a++ = 0x25; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0; *a++ = 0;
497 *a++ = 0x31; *a++ = 0xd2;
498 break;
499 case HI6_ORI:
500 /* 0d 34 12 00 00 or $0x1234,%eax */
501 *a++ = 0xd; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0; *a++ = 0;
502 break;
503 case HI6_XORI:
504 /* 35 34 12 00 00 xor $0x1234,%eax */
505 *a++ = 0x35; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0; *a++ = 0;
506 break;
507 case HI6_SLTIU:
508 /* set if less than, unsigned. (compare edx:eax to ecx:ebx) */
509 /* ecx:ebx = the immediate value */
510 /* bb dc fe ff ff mov $0xfffffedc,%ebx */
511 /* b9 ff ff ff ff mov $0xffffffff,%ecx */
512 /* or */
513 /* 29 c9 sub %ecx,%ecx */
514 #if 0
515 if (bintrans_32bit_only) {
516 /* 99 cltd */
517 *a++ = 0x99;
518 }
519 #endif
520 *a++ = 0xbb; *a++ = uimm; *a++ = uimm >> 8;
521 if (uimm & 0x8000) {
522 *a++ = 0xff; *a++ = 0xff;
523 *a++ = 0xb9; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff;
524 } else {
525 *a++ = 0; *a++ = 0;
526 *a++ = 0x29; *a++ = 0xc9;
527 }
528
529 /* if edx <= ecx and eax < ebx then 1, else 0. */
530 /* 39 ca cmp %ecx,%edx */
531 /* 77 0b ja <ret0> */
532 /* 39 d8 cmp %ebx,%eax */
533 /* 73 07 jae 58 <ret0> */
534 *a++ = 0x39; *a++ = 0xca;
535 *a++ = 0x77; *a++ = 0x0b;
536 *a++ = 0x39; *a++ = 0xd8;
537 *a++ = 0x73; *a++ = 0x07;
538
539 /* b8 01 00 00 00 mov $0x1,%eax */
540 /* eb 02 jmp <common> */
541 *a++ = 0xb8; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
542 *a++ = 0xeb; *a++ = 0x02;
543
544 /* ret0: */
545 /* 29 c0 sub %eax,%eax */
546 *a++ = 0x29; *a++ = 0xc0;
547
548 /* common: */
549 /* 99 cltd */
550 *a++ = 0x99;
551 break;
552 case HI6_SLTI:
553 /* set if less than, signed. (compare edx:eax to ecx:ebx) */
554 /* ecx:ebx = the immediate value */
555 /* bb dc fe ff ff mov $0xfffffedc,%ebx */
556 /* b9 ff ff ff ff mov $0xffffffff,%ecx */
557 /* or */
558 /* 29 c9 sub %ecx,%ecx */
559 #if 0
560 if (bintrans_32bit_only) {
561 /* 99 cltd */
562 *a++ = 0x99;
563 }
564 #endif
565 *a++ = 0xbb; *a++ = uimm; *a++ = uimm >> 8;
566 if (uimm & 0x8000) {
567 *a++ = 0xff; *a++ = 0xff;
568 *a++ = 0xb9; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff;
569 } else {
570 *a++ = 0; *a++ = 0;
571 *a++ = 0x29; *a++ = 0xc9;
572 }
573
574 /* if edx > ecx then 0. */
575 /* if edx < ecx then 1. */
576 /* if eax < ebx then 1, else 0. */
577 /* 39 ca cmp %ecx,%edx */
578 /* 7c 0a jl <ret1> */
579 /* 7f 04 jg <ret0> */
580 /* 39 d8 cmp %ebx,%eax */
581 /* 7c 04 jl <ret1> */
582 *a++ = 0x39; *a++ = 0xca;
583 *a++ = 0x7c; *a++ = 0x0a;
584 *a++ = 0x7f; *a++ = 0x04;
585 *a++ = 0x39; *a++ = 0xd8;
586 *a++ = 0x7c; *a++ = 0x04;
587
588 /* ret0: */
589 /* 29 c0 sub %eax,%eax */
590 /* eb 05 jmp <common> */
591 *a++ = 0x29; *a++ = 0xc0;
592 *a++ = 0xeb; *a++ = 0x05;
593
594 /* ret1: */
595 /* b8 01 00 00 00 mov $0x1,%eax */
596 *a++ = 0xb8; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
597
598 /* common: */
599 /* 99 cltd */
600 *a++ = 0x99;
601 break;
602 }
603
604 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
605
606 rt0:
607 *addrp = a;
608 bintrans_write_pc_inc(addrp);
609 return 1;
610 }
611
612
613 /*
614 * bintrans_write_instruction__jal():
615 */
616 static int bintrans_write_instruction__jal(unsigned char **addrp,
617 int imm, int link)
618 {
619 unsigned char *a;
620 uint32_t subimm;
621 int ofs;
622
623 a = *addrp;
624
625 load_pc_into_eax_edx(&a);
626
627 if (link) {
628 /* gpr[31] = pc + 8 */
629 #if 0
630 if (bintrans_32bit_only) {
631 /* 50 push %eax */
632 /* 83 c0 08 add $0x8,%eax */
633 *a++ = 0x50;
634 *a++ = 0x83; *a++ = 0xc0; *a++ = 0x08;
635 } else
636 #endif
637 {
638 /* 50 push %eax */
639 /* 52 push %edx */
640 /* 83 c0 08 add $0x8,%eax */
641 /* 83 d2 00 adc $0x0,%edx */
642 *a++ = 0x50;
643 *a++ = 0x52;
644 *a++ = 0x83; *a++ = 0xc0; *a++ = 0x08;
645 *a++ = 0x83; *a++ = 0xd2; *a++ = 0x00;
646 }
647 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[31]);
648 #if 0
649 if (bintrans_32bit_only) {
650 /* 58 pop %eax */
651 *a++ = 0x58;
652 } else
653 #endif
654 {
655 /* 5a pop %edx */
656 /* 58 pop %eax */
657 *a++ = 0x5a;
658 *a++ = 0x58;
659 }
660 }
661
662 /* delay_jmpaddr = top 36 bits of pc together with lowest 28 bits of imm*4: */
663 imm *= 4;
664
665 /* Add 4, because the jump is from the delay slot: */
666 /* 83 c0 04 add $0x4,%eax */
667 /* 83 d2 00 adc $0x0,%edx */
668 *a++ = 0x83; *a++ = 0xc0; *a++ = 0x04;
669 *a++ = 0x83; *a++ = 0xd2; *a++ = 0x00;
670
671 /* c1 e8 1c shr $0x1c,%eax */
672 /* c1 e0 1c shl $0x1c,%eax */
673 *a++ = 0xc1; *a++ = 0xe8; *a++ = 0x1c;
674 *a++ = 0xc1; *a++ = 0xe0; *a++ = 0x1c;
675
676 subimm = imm;
677 subimm &= 0x0fffffff;
678
679 /* 0d 78 56 34 12 or $0x12345678,%eax */
680 *a++ = 0x0d; *a++ = subimm; *a++ = subimm >> 8;
681 *a++ = subimm >> 16; *a++ = subimm >> 24;
682
683 store_eax_edx(&a, &dummy_cpu.cd.mips.delay_jmpaddr);
684
685 /* c7 86 38 30 00 00 01 00 00 00 movl $0x1,0x3038(%esi) */
686 ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
687 *a++ = 0xc7; *a++ = 0x86;
688 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
689 *a++ = TO_BE_DELAYED; *a++ = 0; *a++ = 0; *a++ = 0;
690
691 *addrp = a;
692 bintrans_write_pc_inc(addrp);
693 return 1;
694 }
695
696
697 /*
698 * bintrans_write_instruction__addu_etc():
699 */
700 static int bintrans_write_instruction__addu_etc(
701 struct memory *mem, unsigned char **addrp,
702 int rd, int rs, int rt, int sa, int instruction_type)
703 {
704 unsigned char *a;
705 int load64 = 0, do_store = 1;
706
707 /* TODO: Not yet */
708 switch (instruction_type) {
709 case SPECIAL_MULT:
710 case SPECIAL_MULTU:
711 case SPECIAL_DIV:
712 case SPECIAL_DIVU:
713 if (rd != 0)
714 return 0;
715 break;
716 case SPECIAL_DSLL:
717 case SPECIAL_DSLL32:
718 case SPECIAL_DSRA:
719 case SPECIAL_DSRA32:
720 case SPECIAL_DSRL:
721 case SPECIAL_DSRL32:
722 case SPECIAL_MOVZ:
723 case SPECIAL_MOVN:
724 bintrans_write_chunkreturn_fail(addrp);
725 return 0;
726 case SPECIAL_SLT:
727 case SPECIAL_SLTU:
728 if (!mem->bintrans_32bit_only) {
729 bintrans_write_chunkreturn_fail(addrp);
730 return 0;
731 }
732 break;
733 }
734
735 switch (instruction_type) {
736 case SPECIAL_DADDU:
737 case SPECIAL_DSUBU:
738 case SPECIAL_OR:
739 case SPECIAL_AND:
740 case SPECIAL_NOR:
741 case SPECIAL_XOR:
742 case SPECIAL_DSLL:
743 case SPECIAL_DSRL:
744 case SPECIAL_DSRA:
745 case SPECIAL_DSLL32:
746 case SPECIAL_DSRL32:
747 case SPECIAL_DSRA32:
748 load64 = 1;
749 }
750
751 switch (instruction_type) {
752 case SPECIAL_MULT:
753 case SPECIAL_MULTU:
754 case SPECIAL_DIV:
755 case SPECIAL_DIVU:
756 break;
757 default:
758 if (rd == 0)
759 goto rd0;
760 }
761
762 a = *addrp;
763
764 if ((instruction_type == SPECIAL_ADDU || instruction_type == SPECIAL_DADDU
765 || instruction_type == SPECIAL_OR) && rt == 0) {
766 if (load64)
767 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
768 else
769 load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
770 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
771 *addrp = a;
772 goto rd0;
773 }
774
775 /* edx:eax = rs, ecx:ebx = rt */
776 if (load64 && !mem->bintrans_32bit_only) {
777 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
778 /* 89 c3 mov %eax,%ebx */
779 /* 89 d1 mov %edx,%ecx */
780 *a++ = 0x89; *a++ = 0xc3; *a++ = 0x89; *a++ = 0xd1;
781 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
782 } else {
783 load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
784 /* 89 c3 mov %eax,%ebx */
785 /* 89 d1 mov %edx,%ecx */
786 *a++ = 0x89; *a++ = 0xc3; *a++ = 0x89; *a++ = 0xd1;
787 load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
788 }
789
790 switch (instruction_type) {
791 case SPECIAL_ADDU:
792 /* 01 d8 add %ebx,%eax */
793 /* 99 cltd */
794 *a++ = 0x01; *a++ = 0xd8;
795 *a++ = 0x99;
796 break;
797 case SPECIAL_DADDU:
798 /* 01 d8 add %ebx,%eax */
799 /* 11 ca adc %ecx,%edx */
800 *a++ = 0x01; *a++ = 0xd8;
801 *a++ = 0x11; *a++ = 0xca;
802 break;
803 case SPECIAL_SUBU:
804 /* 29 d8 sub %ebx,%eax */
805 /* 99 cltd */
806 *a++ = 0x29; *a++ = 0xd8;
807 *a++ = 0x99;
808 break;
809 case SPECIAL_DSUBU:
810 /* 29 d8 sub %ebx,%eax */
811 /* 19 ca sbb %ecx,%edx */
812 *a++ = 0x29; *a++ = 0xd8;
813 *a++ = 0x19; *a++ = 0xca;
814 break;
815 case SPECIAL_AND:
816 /* 21 d8 and %ebx,%eax */
817 /* 21 ca and %ecx,%edx */
818 *a++ = 0x21; *a++ = 0xd8;
819 *a++ = 0x21; *a++ = 0xca;
820 break;
821 case SPECIAL_OR:
822 /* 09 d8 or %ebx,%eax */
823 /* 09 ca or %ecx,%edx */
824 *a++ = 0x09; *a++ = 0xd8;
825 *a++ = 0x09; *a++ = 0xca;
826 break;
827 case SPECIAL_NOR:
828 /* 09 d8 or %ebx,%eax */
829 /* 09 ca or %ecx,%edx */
830 /* f7 d0 not %eax */
831 /* f7 d2 not %edx */
832 *a++ = 0x09; *a++ = 0xd8;
833 *a++ = 0x09; *a++ = 0xca;
834 *a++ = 0xf7; *a++ = 0xd0;
835 *a++ = 0xf7; *a++ = 0xd2;
836 break;
837 case SPECIAL_XOR:
838 /* 31 d8 xor %ebx,%eax */
839 /* 31 ca xor %ecx,%edx */
840 *a++ = 0x31; *a++ = 0xd8;
841 *a++ = 0x31; *a++ = 0xca;
842 break;
843 case SPECIAL_SLL:
844 /* 89 d8 mov %ebx,%eax */
845 /* c1 e0 1f shl $0x1f,%eax */
846 /* 99 cltd */
847 *a++ = 0x89; *a++ = 0xd8;
848 if (sa == 1) {
849 *a++ = 0xd1; *a++ = 0xe0;
850 } else {
851 *a++ = 0xc1; *a++ = 0xe0; *a++ = sa;
852 }
853 *a++ = 0x99;
854 break;
855 case SPECIAL_SRA:
856 /* 89 d8 mov %ebx,%eax */
857 /* c1 f8 1f sar $0x1f,%eax */
858 /* 99 cltd */
859 *a++ = 0x89; *a++ = 0xd8;
860 if (sa == 1) {
861 *a++ = 0xd1; *a++ = 0xf8;
862 } else {
863 *a++ = 0xc1; *a++ = 0xf8; *a++ = sa;
864 }
865 *a++ = 0x99;
866 break;
867 case SPECIAL_SRL:
868 /* 89 d8 mov %ebx,%eax */
869 /* c1 e8 1f shr $0x1f,%eax */
870 /* 99 cltd */
871 *a++ = 0x89; *a++ = 0xd8;
872 if (sa == 1) {
873 *a++ = 0xd1; *a++ = 0xe8;
874 } else {
875 *a++ = 0xc1; *a++ = 0xe8; *a++ = sa;
876 }
877 *a++ = 0x99;
878 break;
879 case SPECIAL_SLTU:
880 /* NOTE: 32-bit ONLY! */
881 /* set if less than, unsigned. (compare eax to ebx) */
882 /* 39 d8 cmp %ebx,%eax */
883 /* 73 07 jae 58 <ret0> */
884 *a++ = 0x39; *a++ = 0xd8;
885 *a++ = 0x73; *a++ = 0x07;
886
887 /* b8 01 00 00 00 mov $0x1,%eax */
888 /* eb 02 jmp <common> */
889 *a++ = 0xb8; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
890 *a++ = 0xeb; *a++ = 0x02;
891
892 /* ret0: */
893 /* 29 c0 sub %eax,%eax */
894 *a++ = 0x29; *a++ = 0xc0;
895
896 /* common: */
897 /* 99 cltd */
898 *a++ = 0x99;
899 break;
900 case SPECIAL_SLT:
901 /* NOTE: 32-bit ONLY! */
902 /* set if less than, signed. (compare eax to ebx) */
903 /* 39 d8 cmp %ebx,%eax */
904 /* 7c 04 jl <ret1> */
905 *a++ = 0x39; *a++ = 0xd8;
906 *a++ = 0x7c; *a++ = 0x04;
907
908 /* ret0: */
909 /* 29 c0 sub %eax,%eax */
910 /* eb 05 jmp <common> */
911 *a++ = 0x29; *a++ = 0xc0;
912 *a++ = 0xeb; *a++ = 0x05;
913
914 /* ret1: */
915 /* b8 01 00 00 00 mov $0x1,%eax */
916 *a++ = 0xb8; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
917
918 /* common: */
919 /* 99 cltd */
920 *a++ = 0x99;
921 break;
922 case SPECIAL_SLLV:
923 /* rd = rt << (rs&31) (logical) eax = ebx << (eax&31) */
924 /* xchg ebx,eax, then we can do eax = eax << (ebx&31) */
925 /* 93 xchg %eax,%ebx */
926 /* 89 d9 mov %ebx,%ecx */
927 /* 83 e1 1f and $0x1f,%ecx */
928 /* d3 e0 shl %cl,%eax */
929 *a++ = 0x93;
930 *a++ = 0x89; *a++ = 0xd9;
931 *a++ = 0x83; *a++ = 0xe1; *a++ = 0x1f;
932 *a++ = 0xd3; *a++ = 0xe0;
933 /* 99 cltd */
934 *a++ = 0x99;
935 break;
936 case SPECIAL_SRLV:
937 /* rd = rt >> (rs&31) (logical) eax = ebx >> (eax&31) */
938 /* xchg ebx,eax, then we can do eax = eax >> (ebx&31) */
939 /* 93 xchg %eax,%ebx */
940 /* 89 d9 mov %ebx,%ecx */
941 /* 83 e1 1f and $0x1f,%ecx */
942 /* d3 e8 shr %cl,%eax */
943 *a++ = 0x93;
944 *a++ = 0x89; *a++ = 0xd9;
945 *a++ = 0x83; *a++ = 0xe1; *a++ = 0x1f;
946 *a++ = 0xd3; *a++ = 0xe8;
947 /* 99 cltd */
948 *a++ = 0x99;
949 break;
950 case SPECIAL_SRAV:
951 /* rd = rt >> (rs&31) (arithmetic) eax = ebx >> (eax&31) */
952 /* xchg ebx,eax, then we can do eax = eax >> (ebx&31) */
953 /* 93 xchg %eax,%ebx */
954 /* 89 d9 mov %ebx,%ecx */
955 /* 83 e1 1f and $0x1f,%ecx */
956 /* d3 f8 sar %cl,%eax */
957 *a++ = 0x93;
958 *a++ = 0x89; *a++ = 0xd9;
959 *a++ = 0x83; *a++ = 0xe1; *a++ = 0x1f;
960 *a++ = 0xd3; *a++ = 0xf8;
961 /* 99 cltd */
962 *a++ = 0x99;
963 break;
964 case SPECIAL_MULT:
965 case SPECIAL_MULTU:
966 /* 57 push %edi */
967 *a++ = 0x57;
968 if (instruction_type == SPECIAL_MULT) {
969 /* f7 eb imul %ebx */
970 *a++ = 0xf7; *a++ = 0xeb;
971 } else {
972 /* f7 e3 mul %ebx */
973 *a++ = 0xf7; *a++ = 0xe3;
974 }
975 /* here: edx:eax = hi:lo */
976 /* 89 d7 mov %edx,%edi */
977 /* 99 cltd */
978 *a++ = 0x89; *a++ = 0xd7;
979 *a++ = 0x99;
980 /* here: edi=hi, edx:eax = sign-extended lo */
981 store_eax_edx(&a, &dummy_cpu.cd.mips.lo);
982 /* 89 f8 mov %edi,%eax */
983 /* 99 cltd */
984 *a++ = 0x89; *a++ = 0xf8;
985 *a++ = 0x99;
986 /* here: edx:eax = sign-extended hi */
987 store_eax_edx(&a, &dummy_cpu.cd.mips.hi);
988 /* 5f pop %edi */
989 *a++ = 0x5f;
990 do_store = 0;
991 break;
992 case SPECIAL_DIV:
993 case SPECIAL_DIVU:
994 /*
995 * In: edx:eax = rs, ecx:ebx = rt
996 * Out: LO = rs / rt, HI = rs % rt
997 */
998 /* Division by zero on MIPS is undefined, but on
999 i386 it causes an exception, so we'll try to
1000 avoid that. */
1001 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00; /* cmp $0x0,%ebx */
1002 *a++ = 0x75; *a++ = 0x01; /* jne skip_inc */
1003 *a++ = 0x43; /* inc %ebx */
1004
1005 /* 57 push %edi */
1006 *a++ = 0x57;
1007 if (instruction_type == SPECIAL_DIV) {
1008 *a++ = 0x99; /* cltd */
1009 *a++ = 0xf7; *a++ = 0xfb; /* idiv %ebx */
1010 } else {
1011 *a++ = 0x29; *a++ = 0xd2; /* sub %edx,%edx */
1012 *a++ = 0xf7; *a++ = 0xf3; /* div %ebx */
1013 }
1014 /* here: edx:eax = hi:lo */
1015 /* 89 d7 mov %edx,%edi */
1016 /* 99 cltd */
1017 *a++ = 0x89; *a++ = 0xd7;
1018 *a++ = 0x99;
1019 /* here: edi=hi, edx:eax = sign-extended lo */
1020 store_eax_edx(&a, &dummy_cpu.cd.mips.lo);
1021 /* 89 f8 mov %edi,%eax */
1022 /* 99 cltd */
1023 *a++ = 0x89; *a++ = 0xf8;
1024 *a++ = 0x99;
1025 /* here: edx:eax = sign-extended hi */
1026 store_eax_edx(&a, &dummy_cpu.cd.mips.hi);
1027 /* 5f pop %edi */
1028 *a++ = 0x5f;
1029 do_store = 0;
1030 break;
1031 #if 0
1032 /* TODO: These are from bintrans_alpha.c. Translate them to i386. */
1033
1034 case SPECIAL_DSLL:
1035 *a++ = 0x21; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sll t1,sa,t0 */
1036 break;
1037 case SPECIAL_DSLL32:
1038 sa += 32;
1039 *a++ = 0x21; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sll t1,sa,t0 */
1040 break;
1041 case SPECIAL_DSRA:
1042 *a++ = 0x81; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sra t1,sa,t0 */
1043 break;
1044 case SPECIAL_DSRA32:
1045 sa += 32;
1046 *a++ = 0x81; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sra t1,sa,t0 */
1047 break;
1048 case SPECIAL_DSRL:
1049 /* Note: bits of sa are distributed among two different bytes. */
1050 *a++ = 0x81; *a++ = 0x16 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48;
1051 break;
1052 case SPECIAL_DSRL32:
1053 /* Note: bits of sa are distributed among two different bytes. */
1054 sa += 32;
1055 *a++ = 0x81; *a++ = 0x16 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48;
1056 break;
1057 #endif
1058 }
1059
1060 if (do_store)
1061 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
1062
1063 *addrp = a;
1064 rd0:
1065 bintrans_write_pc_inc(addrp);
1066 return 1;
1067 }
1068
1069
1070 /*
1071 * bintrans_write_instruction__mfc_mtc():
1072 */
1073 static int bintrans_write_instruction__mfc_mtc(struct memory *mem,
1074 unsigned char **addrp, int coproc_nr, int flag64bit, int rt,
1075 int rd, int mtcflag)
1076 {
1077 unsigned char *a, *failskip;
1078 int ofs;
1079
1080 if (mtcflag && flag64bit) {
1081 /* dmtc */
1082 return 0;
1083 }
1084
1085 /*
1086 * NOTE: Only a few registers are readable without side effects.
1087 */
1088 if (rt == 0 && !mtcflag)
1089 return 0;
1090
1091 if (coproc_nr >= 1)
1092 return 0;
1093
1094 if (rd == COP0_RANDOM || rd == COP0_COUNT)
1095 return 0;
1096
1097 a = *addrp;
1098
1099 /*************************************************************
1100 *
1101 * TODO: Check for kernel mode, or Coproc X usability bit!
1102 *
1103 *************************************************************/
1104
1105 /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
1106 ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
1107 *a++ = 0x8b; *a++ = 0x96;
1108 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1109
1110 /* here, edx = cpu->coproc[0] */
1111
1112 if (mtcflag) {
1113 /* mtc */
1114
1115 /* TODO: This code only works for mtc0, not dmtc0 */
1116
1117 /* 8b 9a 38 30 00 00 mov 0x3038(%edx),%ebx */
1118 ofs = ((size_t)&dummy_coproc.reg[rd]) - (size_t)&dummy_coproc;
1119 *a++ = 0x8b; *a++ = 0x9a;
1120 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1121
1122 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
1123
1124 /*
1125 * Here: eax contains the value in register rt,
1126 * ebx contains the coproc register rd value.
1127 *
1128 * In the general case, only allow mtc if it does not
1129 * change the coprocessor register!
1130 */
1131
1132 switch (rd) {
1133
1134 case COP0_INDEX:
1135 break;
1136
1137 case COP0_ENTRYLO0:
1138 case COP0_ENTRYLO1:
1139 /* TODO: Not all bits are writable! */
1140 break;
1141
1142 case COP0_EPC:
1143 break;
1144
1145 case COP0_STATUS:
1146 /* Only allow updates to the status register if
1147 the interrupt enable bits were changed, but no
1148 other bits! */
1149 /* 89 c1 mov %eax,%ecx */
1150 /* 89 da mov %ebx,%edx */
1151 /* 81 e1 00 00 e7 0f and $0x0fe70000,%ecx */
1152 /* 81 e2 00 00 e7 0f and $0x0fe70000,%edx */
1153 /* 39 ca cmp %ecx,%edx */
1154 /* 74 01 je <ok> */
1155 *a++ = 0x89; *a++ = 0xc1;
1156 *a++ = 0x89; *a++ = 0xda;
1157 *a++ = 0x81; *a++ = 0xe1; *a++ = 0x00; *a++ = 0x00;
1158 if (mem->bintrans_32bit_only) {
1159 *a++ = 0xe7; *a++ = 0x0f;
1160 } else {
1161 *a++ = 0xff; *a++ = 0xff;
1162 }
1163 *a++ = 0x81; *a++ = 0xe2; *a++ = 0x00; *a++ = 0x00;
1164 if (mem->bintrans_32bit_only) {
1165 *a++ = 0xe7; *a++ = 0x0f;
1166 } else {
1167 *a++ = 0xff; *a++ = 0xff;
1168 }
1169 *a++ = 0x39; *a++ = 0xca;
1170 *a++ = 0x74; failskip = a; *a++ = 0x00;
1171 bintrans_write_chunkreturn_fail(&a);
1172 *failskip = (size_t)a - (size_t)failskip - 1;
1173
1174 /* Only allow the update if it would NOT cause
1175 an interrupt exception: */
1176
1177 /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
1178 ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
1179 *a++ = 0x8b; *a++ = 0x96;
1180 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1181
1182 /* 8b 9a 38 30 00 00 mov 0x3038(%edx),%ebx */
1183 ofs = ((size_t)&dummy_coproc.reg[COP0_CAUSE]) - (size_t)&dummy_coproc;
1184 *a++ = 0x8b; *a++ = 0x9a;
1185 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1186
1187 /* 21 c3 and %eax,%ebx */
1188 /* 81 e3 00 ff 00 00 and $0xff00,%ebx */
1189 /* 83 fb 00 cmp $0x0,%ebx */
1190 /* 74 01 je <ok> */
1191 *a++ = 0x21; *a++ = 0xc3;
1192 *a++ = 0x81; *a++ = 0xe3; *a++ = 0x00;
1193 *a++ = 0xff; *a++ = 0x00; *a++ = 0x00;
1194 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
1195 *a++ = 0x74; failskip = a; *a++ = 0x00;
1196 bintrans_write_chunkreturn_fail(&a);
1197 *failskip = (size_t)a - (size_t)failskip - 1;
1198
1199 break;
1200
1201 default:
1202 /* 39 d8 cmp %ebx,%eax */
1203 /* 74 01 je <ok> */
1204 *a++ = 0x39; *a++ = 0xd8;
1205 *a++ = 0x74; failskip = a; *a++ = 0x00;
1206 bintrans_write_chunkreturn_fail(&a);
1207 *failskip = (size_t)a - (size_t)failskip - 1;
1208 }
1209
1210 /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
1211 ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
1212 *a++ = 0x8b; *a++ = 0x96;
1213 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1214
1215 /* 8d 9a 38 30 00 00 lea 0x3038(%edx),%ebx */
1216 ofs = ((size_t)&dummy_coproc.reg[rd]) - (size_t)&dummy_coproc;
1217 *a++ = 0x8d; *a++ = 0x9a;
1218 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1219
1220 /* Sign-extend eax into edx:eax, and store it in
1221 coprocessor register rd: */
1222 /* 99 cltd */
1223 *a++ = 0x99;
1224
1225 /* 89 03 mov %eax,(%ebx) */
1226 /* 89 53 04 mov %edx,0x4(%ebx) */
1227 *a++ = 0x89; *a++ = 0x03;
1228 *a++ = 0x89; *a++ = 0x53; *a++ = 0x04;
1229 } else {
1230 /* mfc */
1231
1232 /* 8b 82 38 30 00 00 mov 0x3038(%edx),%eax */
1233 ofs = ((size_t)&dummy_coproc.reg[rd]) - (size_t)&dummy_coproc;
1234 *a++ = 0x8b; *a++ = 0x82;
1235 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1236
1237 if (flag64bit) {
1238 /* Load high 32 bits: (note: edx gets overwritten) */
1239 /* 8b 92 3c 30 00 00 mov 0x303c(%edx),%edx */
1240 ofs += 4;
1241 *a++ = 0x8b; *a++ = 0x92;
1242 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1243 } else {
1244 /* 99 cltd */
1245 *a++ = 0x99;
1246 }
1247
1248 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
1249 }
1250
1251 *addrp = a;
1252 bintrans_write_pc_inc(addrp);
1253 return 1;
1254 }
1255
1256
1257 /*
1258 * bintrans_write_instruction__branch():
1259 */
1260 static int bintrans_write_instruction__branch(unsigned char **addrp,
1261 int instruction_type, int regimm_type, int rt, int rs, int imm)
1262 {
1263 unsigned char *a;
1264 unsigned char *skip1 = NULL, *skip2 = NULL;
1265 int ofs, likely = 0;
1266
1267 switch (instruction_type) {
1268 case HI6_BEQL:
1269 case HI6_BNEL:
1270 case HI6_BLEZL:
1271 case HI6_BGTZL:
1272 likely = 1;
1273 }
1274
1275 /* TODO: See the Alpha backend on how these could be implemented: */
1276 if (likely)
1277 return 0;
1278
1279 a = *addrp;
1280
1281 /*
1282 * edx:eax = gpr[rs]; ecx:ebx = gpr[rt];
1283 *
1284 * Compare for equality (BEQ).
1285 * If the result was zero, then it means equality; perform the
1286 * delayed jump. Otherwise: skip.
1287 */
1288
1289 switch (instruction_type) {
1290 case HI6_BEQ:
1291 case HI6_BNE:
1292 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
1293 /* 89 c3 mov %eax,%ebx */
1294 /* 89 d1 mov %edx,%ecx */
1295 *a++ = 0x89; *a++ = 0xc3; *a++ = 0x89; *a++ = 0xd1;
1296 }
1297 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
1298
1299 if (instruction_type == HI6_BEQ && rt != rs) {
1300 /* If rt != rs, then skip. */
1301 /* 39 c3 cmp %eax,%ebx */
1302 /* 75 05 jne 155 <skip> */
1303 /* 39 d1 cmp %edx,%ecx */
1304 /* 75 01 jne 155 <skip> */
1305 *a++ = 0x39; *a++ = 0xc3;
1306 *a++ = 0x75; skip1 = a; *a++ = 0x00;
1307 #if 0
1308 if (!bintrans_32bit_only)
1309 #endif
1310 {
1311 *a++ = 0x39; *a++ = 0xd1;
1312 *a++ = 0x75; skip2 = a; *a++ = 0x00;
1313 }
1314 }
1315
1316 if (instruction_type == HI6_BNE) {
1317 /* If rt != rs, then ok. Otherwise skip. */
1318 #if 0
1319 if (bintrans_32bit_only) {
1320 /* 39 c3 cmp %eax,%ebx */
1321 /* 74 xx je <skip> */
1322 *a++ = 0x39; *a++ = 0xc3;
1323 *a++ = 0x74; skip2 = a; *a++ = 0x00;
1324 } else
1325 #endif
1326 {
1327 /* 39 c3 cmp %eax,%ebx */
1328 /* 75 06 jne 156 <bra> */
1329 /* 39 d1 cmp %edx,%ecx */
1330 /* 75 02 jne 156 <bra> */
1331 /* eb 01 jmp 157 <skip> */
1332 *a++ = 0x39; *a++ = 0xc3;
1333 *a++ = 0x75; *a++ = 0x06;
1334 *a++ = 0x39; *a++ = 0xd1;
1335 *a++ = 0x75; *a++ = 0x02;
1336 *a++ = 0xeb; skip2 = a; *a++ = 0x00;
1337 }
1338 }
1339
1340 if (instruction_type == HI6_BLEZ) {
1341 /* If both eax and edx are zero, then do the branch. */
1342 /* 83 f8 00 cmp $0x0,%eax */
1343 /* 75 07 jne <nott> */
1344 /* 83 fa 00 cmp $0x0,%edx */
1345 /* 75 02 jne 23d <nott> */
1346 /* eb 01 jmp <branch> */
1347 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1348 *a++ = 0x75; *a++ = 0x07;
1349 *a++ = 0x83; *a++ = 0xfa; *a++ = 0x00;
1350 *a++ = 0x75; *a++ = 0x02;
1351 *a++ = 0xeb; skip1 = a; *a++ = 0x00;
1352
1353 /* If high bit of edx is set, then rs < 0. */
1354 /* f7 c2 00 00 00 80 test $0x80000000,%edx */
1355 /* 74 00 jz skip */
1356 *a++ = 0xf7; *a++ = 0xc2; *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x80;
1357 *a++ = 0x74; skip2 = a; *a++ = 0x00;
1358
1359 if (skip1 != NULL)
1360 *skip1 = (size_t)a - (size_t)skip1 - 1;
1361 skip1 = NULL;
1362 }
1363 if (instruction_type == HI6_BGTZ) {
1364 /* If both eax and edx are zero, then skip the branch. */
1365 /* 83 f8 00 cmp $0x0,%eax */
1366 /* 75 07 jne <nott> */
1367 /* 83 fa 00 cmp $0x0,%edx */
1368 /* 75 02 jne 23d <nott> */
1369 /* eb 01 jmp <skip> */
1370 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1371 *a++ = 0x75; *a++ = 0x07;
1372 *a++ = 0x83; *a++ = 0xfa; *a++ = 0x00;
1373 *a++ = 0x75; *a++ = 0x02;
1374 *a++ = 0xeb; skip1 = a; *a++ = 0x00;
1375
1376 /* If high bit of edx is set, then rs < 0. */
1377 /* f7 c2 00 00 00 80 test $0x80000000,%edx */
1378 /* 75 00 jnz skip */
1379 *a++ = 0xf7; *a++ = 0xc2; *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x80;
1380 *a++ = 0x75; skip2 = a; *a++ = 0x00;
1381 }
1382 if (instruction_type == HI6_REGIMM && regimm_type == REGIMM_BLTZ) {
1383 /* If high bit of edx is set, then rs < 0. */
1384 /* f7 c2 00 00 00 80 test $0x80000000,%edx */
1385 /* 74 00 jz skip */
1386 *a++ = 0xf7; *a++ = 0xc2; *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x80;
1387 *a++ = 0x74; skip2 = a; *a++ = 0x00;
1388 }
1389 if (instruction_type == HI6_REGIMM && regimm_type == REGIMM_BGEZ) {
1390 /* If high bit of edx is not set, then rs >= 0. */
1391 /* f7 c2 00 00 00 80 test $0x80000000,%edx */
1392 /* 75 00 jnz skip */
1393 *a++ = 0xf7; *a++ = 0xc2; *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x80;
1394 *a++ = 0x75; skip2 = a; *a++ = 0x00;
1395 }
1396
1397 /*
1398 * Perform the jump by setting cpu->delay_slot = TO_BE_DELAYED
1399 * and cpu->delay_jmpaddr = pc + 4 + (imm << 2).
1400 */
1401
1402 /* c7 86 38 30 00 00 01 00 00 00 movl $0x1,0x3038(%esi) */
1403 ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
1404 *a++ = 0xc7; *a++ = 0x86;
1405 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1406 *a++ = TO_BE_DELAYED; *a++ = 0; *a++ = 0; *a++ = 0;
1407
1408 load_pc_into_eax_edx(&a);
1409
1410 /* 05 78 56 34 12 add $0x12345678,%eax */
1411 /* 83 d2 00 adc $0x0,%edx */
1412 /* or */
1413 /* 83 d2 ff adc $0xffffffff,%edx */
1414 imm = (imm << 2) + 4;
1415 *a++ = 0x05; *a++ = imm; *a++ = imm >> 8; *a++ = imm >> 16; *a++ = imm >> 24;
1416 if (imm >= 0) {
1417 *a++ = 0x83; *a++ = 0xd2; *a++ = 0x00;
1418 } else {
1419 *a++ = 0x83; *a++ = 0xd2; *a++ = 0xff;
1420 }
1421 store_eax_edx(&a, &dummy_cpu.cd.mips.delay_jmpaddr);
1422
1423 if (skip1 != NULL)
1424 *skip1 = (size_t)a - (size_t)skip1 - 1;
1425 if (skip2 != NULL)
1426 *skip2 = (size_t)a - (size_t)skip2 - 1;
1427
1428 *addrp = a;
1429 bintrans_write_pc_inc(addrp);
1430 return 1;
1431 }
1432
1433
1434 /*
1435 * bintrans_write_instruction__delayedbranch():
1436 */
1437 static int bintrans_write_instruction__delayedbranch(struct memory *mem,
1438 unsigned char **addrp, uint32_t *potential_chunk_p, uint32_t *chunks,
1439 int only_care_about_chunk_p, int p, int forward)
1440 {
1441 unsigned char *a, *skip=NULL, *failskip;
1442 int ofs;
1443 uint32_t i386_addr;
1444
1445 a = *addrp;
1446
1447 if (only_care_about_chunk_p)
1448 goto try_chunk_p;
1449
1450 /* Skip all of this if there is no branch: */
1451 ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
1452
1453 /* 8b 86 38 30 00 00 mov 0x3038(%esi),%eax */
1454 *a++ = 0x8b; *a++ = 0x86;
1455 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1456
1457 /* 83 f8 00 cmp $0x0,%eax */
1458 /* 74 01 je 16b <skippa> */
1459 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1460 *a++ = 0x74; skip = a; *a++ = 0;
1461
1462 /*
1463 * Perform the jump by setting cpu->delay_slot = 0
1464 * and pc = cpu->delay_jmpaddr.
1465 */
1466
1467 /* c7 86 38 30 00 00 00 00 00 00 movl $0x0,0x3038(%esi) */
1468 ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
1469 *a++ = 0xc7; *a++ = 0x86;
1470 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1471 *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0;
1472
1473 /* REMEMBER old pc: */
1474 load_pc_into_eax_edx(&a);
1475 /* 89 c3 mov %eax,%ebx */
1476 /* 89 d1 mov %edx,%ecx */
1477 *a++ = 0x89; *a++ = 0xc3;
1478 *a++ = 0x89; *a++ = 0xd1;
1479 load_into_eax_edx(&a, &dummy_cpu.cd.mips.delay_jmpaddr);
1480 store_eax_edx_into_pc(&a);
1481
1482 try_chunk_p:
1483
1484 if (potential_chunk_p == NULL) {
1485 if (mem->bintrans_32bit_only) {
1486 #if 1
1487 /* 8b 86 78 56 34 12 mov 0x12345678(%esi),%eax */
1488 /* ff e0 jmp *%eax */
1489 ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_jump_to_32bit_pc) - (size_t)&dummy_cpu;
1490 *a++ = 0x8b; *a++ = 0x86;
1491 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1492 *a++ = 0xff; *a++ = 0xe0;
1493
1494 #else
1495 /* Don't execute too many instructions. */
1496 /* 81 fd f0 1f 00 00 cmpl $0x1ff0,%ebp */
1497 /* 7c 01 jl <okk> */
1498 /* c3 ret */
1499 *a++ = 0x81; *a++ = 0xfd;
1500 *a++ = (N_SAFE_BINTRANS_LIMIT-1) & 255;
1501 *a++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8) & 255; *a++ = 0; *a++ = 0;
1502 *a++ = 0x7c; failskip = a; *a++ = 0x01;
1503 bintrans_write_chunkreturn_fail(&a);
1504 *failskip = (size_t)a - (size_t)failskip - 1;
1505
1506 /*
1507 * ebx = ((vaddr >> 22) & 1023) * sizeof(void *)
1508 *
1509 * 89 c3 mov %eax,%ebx
1510 * c1 eb 14 shr $20,%ebx
1511 * 81 e3 fc 0f 00 00 and $0xffc,%ebx
1512 */
1513 *a++ = 0x89; *a++ = 0xc3;
1514 *a++ = 0xc1; *a++ = 0xeb; *a++ = 0x14;
1515 *a++ = 0x81; *a++ = 0xe3; *a++ = 0xfc; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1516
1517 /*
1518 * ecx = vaddr_to_hostaddr_table0
1519 *
1520 * 8b 8e 34 12 00 00 mov 0x1234(%esi),%ecx
1521 */
1522 ofs = ((size_t)&dummy_cpu.cd.mips.vaddr_to_hostaddr_table0) - (size_t)&dummy_cpu;
1523 *a++ = 0x8b; *a++ = 0x8e;
1524 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1525
1526 /*
1527 * ecx = vaddr_to_hostaddr_table0[a]
1528 *
1529 * 8b 0c 19 mov (%ecx,%ebx),%ecx
1530 */
1531 *a++ = 0x8b; *a++ = 0x0c; *a++ = 0x19;
1532
1533 /*
1534 * ebx = ((vaddr >> 12) & 1023) * sizeof(void *)
1535 *
1536 * 89 c3 mov %eax,%ebx
1537 * c1 eb 0a shr $10,%ebx
1538 * 81 e3 fc 0f 00 00 and $0xffc,%ebx
1539 */
1540 *a++ = 0x89; *a++ = 0xc3;
1541 *a++ = 0xc1; *a++ = 0xeb; *a++ = 0x0a;
1542 *a++ = 0x81; *a++ = 0xe3; *a++ = 0xfc; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1543
1544 /*
1545 * ecx = vaddr_to_hostaddr_table0[a][b].cd.mips.chunks
1546 *
1547 * 8b 8c 19 56 34 12 00 mov 0x123456(%ecx,%ebx,1),%ecx
1548 */
1549 ofs = (size_t)&dummy_vth32_table.cd.mips.bintrans_chunks[0]
1550 - (size_t)&dummy_vth32_table;
1551
1552 *a++ = 0x8b; *a++ = 0x8c; *a++ = 0x19;
1553 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1554
1555 /*
1556 * ecx = NULL? Then return with failure.
1557 *
1558 * 83 f9 00 cmp $0x0,%ecx
1559 * 75 01 jne <okzzz>
1560 */
1561 *a++ = 0x83; *a++ = 0xf9; *a++ = 0x00;
1562 *a++ = 0x75; fail = a; *a++ = 0x00;
1563 bintrans_write_chunkreturn(&a);
1564 *fail = (size_t)a - (size_t)fail - 1;
1565
1566 /*
1567 * 25 fc 0f 00 00 and $0xffc,%eax
1568 * 01 c1 add %eax,%ecx
1569 *
1570 * 8b 01 mov (%ecx),%eax
1571 *
1572 * 83 f8 00 cmp $0x0,%eax
1573 * 75 01 jne <ok>
1574 * c3 ret
1575 */
1576 *a++ = 0x25; *a++ = 0xfc; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1577 *a++ = 0x01; *a++ = 0xc1;
1578
1579 *a++ = 0x8b; *a++ = 0x01;
1580
1581 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1582 *a++ = 0x75; fail = a; *a++ = 0x01;
1583 bintrans_write_chunkreturn(&a);
1584 *fail = (size_t)a - (size_t)fail - 1;
1585
1586 /* 03 86 78 56 34 12 add 0x12345678(%esi),%eax */
1587 /* ff e0 jmp *%eax */
1588 ofs = ((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu;
1589 *a++ = 0x03; *a++ = 0x86;
1590 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1591 *a++ = 0xff; *a++ = 0xe0;
1592 #endif
1593 } else {
1594 /* Not much we can do here if this wasn't to the same physical page... */
1595
1596 /* Don't execute too many instructions. */
1597 /* 81 fd f0 1f 00 00 cmpl $0x1ff0,%ebp */
1598 /* 7c 01 jl <okk> */
1599 /* c3 ret */
1600 *a++ = 0x81; *a++ = 0xfd;
1601 *a++ = (N_SAFE_BINTRANS_LIMIT-1) & 255;
1602 *a++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8) & 255; *a++ = 0; *a++ = 0;
1603 *a++ = 0x7c; failskip = a; *a++ = 0x01;
1604 bintrans_write_chunkreturn_fail(&a);
1605 *failskip = (size_t)a - (size_t)failskip - 1;
1606
1607 /*
1608 * Compare the old pc (ecx:ebx) and the new pc (edx:eax). If they are on the
1609 * same virtual page (which means that they are on the same physical
1610 * page), then we can check the right chunk pointer, and if it
1611 * is non-NULL, then we can jump there. Otherwise just return.
1612 */
1613
1614 /* Subtract 4 from the old pc first. (This is where the jump originated from.) */
1615 /* 83 eb 04 sub $0x4,%ebx */
1616 /* 83 d9 00 sbb $0x0,%ecx */
1617 *a++ = 0x83; *a++ = 0xeb; *a++ = 0x04;
1618 *a++ = 0x83; *a++ = 0xd9; *a++ = 0x00;
1619
1620 /* 39 d1 cmp %edx,%ecx */
1621 /* 74 01 je 1b9 <ok2> */
1622 /* c3 ret */
1623 *a++ = 0x39; *a++ = 0xd1;
1624 *a++ = 0x74; *a++ = 0x01;
1625 *a++ = 0xc3;
1626
1627 /* Remember new pc: */
1628 /* 89 c1 mov %eax,%ecx */
1629 *a++ = 0x89; *a++ = 0xc1;
1630
1631 /* 81 e3 00 f0 ff ff and $0xfffff000,%ebx */
1632 /* 25 00 f0 ff ff and $0xfffff000,%eax */
1633 *a++ = 0x81; *a++ = 0xe3; *a++ = 0x00; *a++ = 0xf0; *a++ = 0xff; *a++ = 0xff;
1634 *a++ = 0x25; *a++ = 0x00; *a++ = 0xf0; *a++ = 0xff; *a++ = 0xff;
1635
1636 /* 39 c3 cmp %eax,%ebx */
1637 /* 74 01 je <ok1> */
1638 /* c3 ret */
1639 *a++ = 0x39; *a++ = 0xc3;
1640 *a++ = 0x74; *a++ = 0x01;
1641 *a++ = 0xc3;
1642
1643 /* 81 e1 ff 0f 00 00 and $0xfff,%ecx */
1644 *a++ = 0x81; *a++ = 0xe1; *a++ = 0xff; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1645
1646 /* 8b 81 78 56 34 12 mov 0x12345678(%ecx),%eax */
1647 ofs = (size_t)chunks;
1648 *a++ = 0x8b; *a++ = 0x81; *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1649
1650 /* 83 f8 00 cmp $0x0,%eax */
1651 /* 75 01 jne 1cd <okjump> */
1652 /* c3 ret */
1653 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1654 *a++ = 0x75; *a++ = 0x01;
1655 *a++ = 0xc3;
1656
1657 /* 03 86 78 56 34 12 add 0x12345678(%esi),%eax */
1658 /* ff e0 jmp *%eax */
1659 ofs = ((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu;
1660 *a++ = 0x03; *a++ = 0x86;
1661 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1662 *a++ = 0xff; *a++ = 0xe0;
1663 }
1664 } else {
1665 /*
1666 * Just to make sure that we don't become too unreliant
1667 * on the main program loop, we need to return every once
1668 * in a while (interrupts etc).
1669 *
1670 * Load the "nr of instructions executed" (which is an int)
1671 * and see if it is below a certain threshold. If so, then
1672 * we go on with the fast path (bintrans), otherwise we
1673 * abort by returning.
1674 */
1675 /* 81 fd f0 1f 00 00 cmpl $0x1ff0,%ebp */
1676 /* 7c 01 jl <okk> */
1677 /* c3 ret */
1678 if (!only_care_about_chunk_p && !forward) {
1679 *a++ = 0x81; *a++ = 0xfd;
1680 *a++ = (N_SAFE_BINTRANS_LIMIT-1) & 255;
1681 *a++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8) & 255; *a++ = 0; *a++ = 0;
1682 *a++ = 0x7c; failskip = a; *a++ = 0x01;
1683 bintrans_write_chunkreturn_fail(&a);
1684 *failskip = (size_t)a - (size_t)failskip - 1;
1685 }
1686
1687 /*
1688 * potential_chunk_p points to an "uint32_t".
1689 * If this value is non-NULL, then it is a piece of i386
1690 * machine language code corresponding to the address
1691 * we're jumping to. Otherwise, those instructions haven't
1692 * been translated yet, so we have to return to the main
1693 * loop. (Actually, we have to add cpu->chunk_base_address.)
1694 *
1695 * Case 1: The value is non-NULL already at translation
1696 * time. Then we can make a direct (fast) native
1697 * i386 jump to the code chunk.
1698 *
1699 * Case 2: The value was NULL at translation time, then we
1700 * have to check during runtime.
1701 */
1702
1703 /* Case 1: */
1704 /* printf("%08x ", *potential_chunk_p); */
1705 i386_addr = *potential_chunk_p +
1706 (size_t)mem->translation_code_chunk_space;
1707 i386_addr = i386_addr - ((size_t)a + 5);
1708 if ((*potential_chunk_p) != 0) {
1709 *a++ = 0xe9;
1710 *a++ = i386_addr;
1711 *a++ = i386_addr >> 8;
1712 *a++ = i386_addr >> 16;
1713 *a++ = i386_addr >> 24;
1714 } else {
1715 /* Case 2: */
1716
1717 bintrans_register_potential_quick_jump(mem, a, p);
1718
1719 i386_addr = (size_t)potential_chunk_p;
1720
1721 /*
1722 * Load the chunk pointer into eax.
1723 * If it is NULL (zero), then skip the following jump.
1724 * Add chunk_base_address to eax, and jump to eax.
1725 */
1726
1727 /* a1 78 56 34 12 mov 0x12345678,%eax */
1728 /* 83 f8 00 cmp $0x0,%eax */
1729 /* 75 01 jne <okaa> */
1730 /* c3 ret */
1731 *a++ = 0xa1;
1732 *a++ = i386_addr; *a++ = i386_addr >> 8;
1733 *a++ = i386_addr >> 16; *a++ = i386_addr >> 24;
1734 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1735 *a++ = 0x75; *a++ = 0x01;
1736 *a++ = 0xc3;
1737
1738 /* 03 86 78 56 34 12 add 0x12345678(%esi),%eax */
1739 /* ff e0 jmp *%eax */
1740 ofs = ((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu;
1741 *a++ = 0x03; *a++ = 0x86;
1742 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1743 *a++ = 0xff; *a++ = 0xe0;
1744 }
1745 }
1746
1747 if (skip != NULL)
1748 *skip = (size_t)a - (size_t)skip - 1;
1749
1750 *addrp = a;
1751 return 1;
1752 }
1753
1754
1755 /*
1756 * bintrans_write_instruction__loadstore():
1757 */
1758 static int bintrans_write_instruction__loadstore(struct memory *mem,
1759 unsigned char **addrp, int rt, int imm, int rs,
1760 int instruction_type, int bigendian, int do_alignment_check)
1761 {
1762 unsigned char *a, *retfail, *generic64bit, *doloadstore,
1763 *okret0, *okret1, *okret2, *skip;
1764 int ofs, alignment, load=0, unaligned=0;
1765
1766 /* TODO: Not yet: */
1767 if (instruction_type == HI6_LQ_MDMX || instruction_type == HI6_SQ)
1768 return 0;
1769
1770 switch (instruction_type) {
1771 case HI6_LQ_MDMX:
1772 case HI6_LDL:
1773 case HI6_LDR:
1774 case HI6_LD:
1775 case HI6_LWU:
1776 case HI6_LWL:
1777 case HI6_LWR:
1778 case HI6_LW:
1779 case HI6_LHU:
1780 case HI6_LH:
1781 case HI6_LBU:
1782 case HI6_LB:
1783 load = 1;
1784 if (rt == 0)
1785 return 0;
1786 }
1787
1788 switch (instruction_type) {
1789 case HI6_LWL:
1790 case HI6_LWR:
1791 case HI6_LDL:
1792 case HI6_LDR:
1793 case HI6_SWL:
1794 case HI6_SWR:
1795 case HI6_SDL:
1796 case HI6_SDR:
1797 unaligned = 1;
1798 }
1799
1800 /* TODO: Not yet: */
1801 if (bigendian && unaligned)
1802 return 0;
1803
1804 a = *addrp;
1805
1806 if (mem->bintrans_32bit_only)
1807 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
1808 else
1809 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
1810
1811 if (imm != 0) {
1812 if (imm & 0x8000) {
1813 /* 05 34 f2 ff ff add $0xfffff234,%eax */
1814 /* 83 d2 ff adc $0xffffffff,%edx */
1815 *a++ = 5;
1816 *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
1817 if (!mem->bintrans_32bit_only) {
1818 *a++ = 0x83; *a++ = 0xd2; *a++ = 0xff;
1819 }
1820 } else {
1821 /* 05 34 12 00 00 add $0x1234,%eax */
1822 /* 83 d2 00 adc $0x0,%edx */
1823 *a++ = 5;
1824 *a++ = imm; *a++ = imm >> 8; *a++ = 0; *a++ = 0;
1825 if (!mem->bintrans_32bit_only) {
1826 *a++ = 0x83; *a++ = 0xd2; *a++ = 0;
1827 }
1828 }
1829 }
1830
1831 alignment = 0;
1832 switch (instruction_type) {
1833 case HI6_LQ_MDMX:
1834 case HI6_SQ:
1835 alignment = 15;
1836 break;
1837 case HI6_LD:
1838 case HI6_LDL:
1839 case HI6_LDR:
1840 case HI6_SD:
1841 case HI6_SDL:
1842 case HI6_SDR:
1843 alignment = 7;
1844 break;
1845 case HI6_LW:
1846 case HI6_LWL:
1847 case HI6_LWR:
1848 case HI6_LWU:
1849 case HI6_SW:
1850 case HI6_SWL:
1851 case HI6_SWR:
1852 alignment = 3;
1853 break;
1854 case HI6_LH:
1855 case HI6_LHU:
1856 case HI6_SH:
1857 alignment = 1;
1858 break;
1859 }
1860
1861 if (unaligned) {
1862 /*
1863 * Perform the actual load/store from an
1864 * aligned address.
1865 *
1866 * 83 e0 fc and $0xfffffffc,%eax
1867 */
1868 *a++ = 0x83; *a++ = 0xe0; *a++ = 0xff - alignment;
1869 } else if (alignment > 0 && do_alignment_check) {
1870 unsigned char *alignskip;
1871 /*
1872 * Check alignment:
1873 *
1874 * 89 c3 mov %eax,%ebx
1875 * 83 e3 01 and $0x1,%ebx
1876 * 74 01 jz <ok>
1877 * c3 ret
1878 */
1879 *a++ = 0x89; *a++ = 0xc3;
1880 *a++ = 0x83; *a++ = 0xe3; *a++ = alignment;
1881 *a++ = 0x74; alignskip = a; *a++ = 0x00;
1882 bintrans_write_chunkreturn_fail(&a);
1883 *alignskip = (size_t)a - (size_t)alignskip - 1;
1884 }
1885
1886
1887 /* Here, edx:eax = vaddr */
1888
1889 if (mem->bintrans_32bit_only) {
1890 /* ebx = vaddr >> 12; */
1891 *a++ = 0x89; *a++ = 0xc3; /* mov %eax, %ebx */
1892 *a++ = 0xc1; *a++ = 0xeb; *a++ = 0x0c; /* shr $12, %ebx */
1893
1894 if (load) {
1895 /* ecx = cpu->cd.mips.host_load */
1896 *a++ = 0x8b; *a++ = 0x8e; *a++ = ofs_h_l & 255;
1897 *a++ = (ofs_h_l >> 8) & 255;
1898 *a++ = (ofs_h_l >> 16) & 255; *a++ = (ofs_h_l >> 24) & 255;
1899 } else {
1900 /* ecx = cpu->cd.mips.host_store */
1901 *a++ = 0x8b; *a++ = 0x8e; *a++ = ofs_h_s & 255;
1902 *a++ = (ofs_h_s >> 8) & 255;
1903 *a++ = (ofs_h_s >> 16) & 255; *a++ = (ofs_h_s >> 24) & 255;
1904 }
1905
1906 /* ecx = host_load[a] (or host_store[a]) */
1907 *a++ = 0x8b; *a++ = 0x0c; *a++ = 0x99; /* mov (%ecx,%ebx,4),%ecx */
1908
1909 /*
1910 * ecx = NULL? Then return with failure.
1911 *
1912 * 83 f9 00 cmp $0x0,%ecx
1913 * 75 01 jne <okzzz>
1914 */
1915 *a++ = 0x83; *a++ = 0xf9; *a++ = 0x00;
1916 *a++ = 0x75; retfail = a; *a++ = 0x00;
1917 bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
1918 *retfail = (size_t)a - (size_t)retfail - 1;
1919
1920 /*
1921 * eax = offset within page = vaddr & 0xfff
1922 * ecx = host address ( = host page + offset)
1923 *
1924 * 25 ff 0f 00 00 and $0xfff,%eax
1925 * 01 c1 add %eax,%ecx
1926 */
1927 *a++ = 0x25; *a++ = 0xff; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1928 *a++ = 0x01; *a++ = 0xc1;
1929 } else {
1930 /*
1931 * If the load/store address has the top 32 bits set to
1932 * 0x00000000 or 0xffffffff, then we can use the 32-bit
1933 * lookup tables:
1934 *
1935
1936 TODO: top 33 bits!!!!!!!
1937
1938 * 83 fa 00 cmp $0x0,%edx
1939 * 74 05 je <ok32>
1940 * 83 fa ff cmp $0xffffffff,%edx
1941 * 75 01 jne <not32>
1942 */
1943 *a++ = 0x83; *a++ = 0xfa; *a++ = 0x00;
1944 *a++ = 0x74; *a++ = 0x05;
1945 *a++ = 0x83; *a++ = 0xfa; *a++ = 0xff;
1946 *a++ = 0x75; generic64bit = a; *a++ = 0x01;
1947
1948 /* Call the quick lookup routine: */
1949 if (load)
1950 ofs = (size_t)bintrans_load_32bit;
1951 else
1952 ofs = (size_t)bintrans_store_32bit;
1953 ofs = ofs - ((size_t)a + 5);
1954 *a++ = 0xe8; *a++ = ofs; *a++ = ofs >> 8;
1955 *a++ = ofs >> 16; *a++ = ofs >> 24;
1956
1957 /*
1958 * ecx = NULL? Then return with failure.
1959 *
1960 * 83 f9 00 cmp $0x0,%ecx
1961 * 75 01 jne <okzzz>
1962 */
1963 *a++ = 0x83; *a++ = 0xf9; *a++ = 0x00;
1964 *a++ = 0x75; retfail = a; *a++ = 0x00;
1965 bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
1966 *retfail = (size_t)a - (size_t)retfail - 1;
1967
1968 /*
1969 * eax = offset within page = vaddr & 0xfff
1970 * ecx = host address ( = host page + offset)
1971 *
1972 * 25 ff 0f 00 00 and $0xfff,%eax
1973 * 01 c1 add %eax,%ecx
1974 */
1975 *a++ = 0x25; *a++ = 0xff; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1976 *a++ = 0x01; *a++ = 0xc1;
1977
1978 *a++ = 0xeb; doloadstore = a; *a++ = 0x01;
1979
1980
1981 /* TODO: The stuff above is so similar to the pure 32-bit
1982 case that it should be factored out. */
1983
1984
1985 *generic64bit = (size_t)a - (size_t)generic64bit - 1;
1986
1987 /*
1988 * 64-bit generic case:
1989 */
1990
1991 /* push writeflag */
1992 *a++ = 0x6a; *a++ = load? 0 : 1;
1993
1994 /* push vaddr (edx:eax) */
1995 *a++ = 0x52; *a++ = 0x50;
1996
1997 /* push cpu (esi) */
1998 *a++ = 0x56;
1999
2000 /* eax = points to the right function */
2001 ofs = ((size_t)&dummy_cpu.cd.mips.fast_vaddr_to_hostaddr) - (size_t)&dummy_cpu;
2002 *a++ = 0x8b; *a++ = 0x86;
2003 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
2004
2005 /* ff d0 call *%eax */
2006 *a++ = 0xff; *a++ = 0xd0;
2007
2008 /* 83 c4 08 add $0x10,%esp */
2009 *a++ = 0x83; *a++ = 0xc4; *a++ = 0x10;
2010
2011 /* If eax is NULL, then return. */
2012 /* 83 f8 00 cmp $0x0,%eax */
2013 /* 75 01 jne 1cd <okjump> */
2014 /* c3 ret */
2015 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
2016 *a++ = 0x75; retfail = a; *a++ = 0x00;
2017 bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
2018 *retfail = (size_t)a - (size_t)retfail - 1;
2019
2020 /* 89 c1 mov %eax,%ecx */
2021 *a++ = 0x89; *a++ = 0xc1;
2022
2023 *doloadstore = (size_t)a - (size_t)doloadstore - 1;
2024 }
2025
2026
2027 if (!load) {
2028 if (rt == MIPS_GPR_ZERO) {
2029 switch (alignment) {
2030 case 7: *a++ = 0x31; *a++ = 0xd2; /* edx = 0 */
2031 case 3:
2032 case 1: *a++ = 0x31; *a++ = 0xc0; break; /* eax = 0 */
2033 case 0: *a++ = 0xb0; *a++ = 0x00; break; /* al = 0 */
2034 default:fatal("todo: zero\n"); exit(1);
2035 }
2036 } else {
2037 if (alignment >= 7)
2038 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2039 else
2040 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2041 }
2042 }
2043
2044 switch (instruction_type) {
2045 case HI6_LD:
2046 if (bigendian) {
2047 /* 8b 11 mov (%ecx),%edx */
2048 /* 8b 41 04 mov 0x4(%ecx),%eax */
2049 /* 0f c8 bswap %eax */
2050 /* 0f ca bswap %edx */
2051 *a++ = 0x8b; *a++ = 0x11;
2052 *a++ = 0x8b; *a++ = 0x41; *a++ = 0x04;
2053 *a++ = 0x0f; *a++ = 0xc8;
2054 *a++ = 0x0f; *a++ = 0xca;
2055 } else {
2056 /* 8b 01 mov (%ecx),%eax */
2057 /* 8b 51 04 mov 0x4(%ecx),%edx */
2058 *a++ = 0x8b; *a++ = 0x01;
2059 *a++ = 0x8b; *a++ = 0x51; *a++ = 0x04;
2060 }
2061 break;
2062 case HI6_LWU:
2063 /* 8b 01 mov (%ecx),%eax */
2064 /* 0f c8 bswap %eax (big endian) */
2065 /* 31 d2 xor %edx,%edx */
2066 *a++ = 0x8b; *a++ = 0x01;
2067 if (bigendian) {
2068 *a++ = 0x0f; *a++ = 0xc8;
2069 }
2070 *a++ = 0x31; *a++ = 0xd2;
2071 break;
2072 case HI6_LW:
2073 /* 8b 01 mov (%ecx),%eax */
2074 /* 0f c8 bswap %eax (big endian) */
2075 /* 99 cltd */
2076 *a++ = 0x8b; *a++ = 0x01;
2077 if (bigendian) {
2078 *a++ = 0x0f; *a++ = 0xc8;
2079 }
2080 *a++ = 0x99;
2081 break;
2082 case HI6_LHU:
2083 /* 31 c0 xor %eax,%eax */
2084 /* 66 8b 01 mov (%ecx),%ax */
2085 /* 86 c4 xchg %al,%ah (big endian) */
2086 /* 99 cltd */
2087 *a++ = 0x31; *a++ = 0xc0;
2088 *a++ = 0x66; *a++ = 0x8b; *a++ = 0x01;
2089 if (bigendian) {
2090 *a++ = 0x86; *a++ = 0xc4;
2091 }
2092 *a++ = 0x99;
2093 break;
2094 case HI6_LH:
2095 /* 66 8b 01 mov (%ecx),%ax */
2096 /* 86 c4 xchg %al,%ah (big endian) */
2097 /* 98 cwtl */
2098 /* 99 cltd */
2099 *a++ = 0x66; *a++ = 0x8b; *a++ = 0x01;
2100 if (bigendian) {
2101 *a++ = 0x86; *a++ = 0xc4;
2102 }
2103 *a++ = 0x98;
2104 *a++ = 0x99;
2105 break;
2106 case HI6_LBU:
2107 /* 31 c0 xor %eax,%eax */
2108 /* 8a 01 mov (%ecx),%al */
2109 /* 99 cltd */
2110 *a++ = 0x31; *a++ = 0xc0;
2111 *a++ = 0x8a; *a++ = 0x01;
2112 *a++ = 0x99;
2113 break;
2114 case HI6_LB:
2115 /* 8a 01 mov (%ecx),%al */
2116 /* 66 98 cbtw */
2117 /* 98 cwtl */
2118 /* 99 cltd */
2119 *a++ = 0x8a; *a++ = 0x01;
2120 *a++ = 0x66; *a++ = 0x98;
2121 *a++ = 0x98;
2122 *a++ = 0x99;
2123 break;
2124
2125 case HI6_LWL:
2126 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
2127 /* 05 34 f2 ff ff add $0xfffff234,%eax */
2128 *a++ = 5;
2129 *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
2130 /* 83 e0 03 and $0x03,%eax */
2131 *a++ = 0x83; *a++ = 0xe0; *a++ = alignment;
2132 /* 89 c3 mov %eax,%ebx */
2133 *a++ = 0x89; *a++ = 0xc3;
2134
2135 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2136
2137 /* ALIGNED LOAD: */
2138 /* 8b 11 mov (%ecx),%edx */
2139 *a++ = 0x8b; *a++ = 0x11;
2140
2141 /*
2142 * CASE 0:
2143 * memory = 0x12 0x34 0x56 0x78
2144 * register after lwl: 0x12 0x.. 0x.. 0x..
2145 */
2146 /* 83 fb 00 cmp $0x0,%ebx */
2147 /* 75 01 jne <skip> */
2148 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
2149 *a++ = 0x75; skip = a; *a++ = 0x01;
2150
2151 /* c1 e2 18 shl $0x18,%edx */
2152 /* 25 ff ff ff 00 and $0xffffff,%eax */
2153 /* 09 d0 or %edx,%eax */
2154 *a++ = 0xc1; *a++ = 0xe2; *a++ = 0x18;
2155 *a++ = 0x25; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff; *a++ = 0x00;
2156 *a++ = 0x09; *a++ = 0xd0;
2157
2158 /* eb 00 jmp <okret> */
2159 *a++ = 0xeb; okret0 = a; *a++ = 0;
2160
2161 *skip = (size_t)a - (size_t)skip - 1;
2162
2163 /*
2164 * CASE 1:
2165 * memory = 0x12 0x34 0x56 0x78
2166 * register after lwl: 0x34 0x12 0x.. 0x..
2167 */
2168 /* 83 fb 01 cmp $0x1,%ebx */
2169 /* 75 01 jne <skip> */
2170 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x01;
2171 *a++ = 0x75; skip = a; *a++ = 0x01;
2172
2173 /* c1 e2 10 shl $0x10,%edx */
2174 /* 25 ff ff 00 00 and $0xffff,%eax */
2175 /* 09 d0 or %edx,%eax */
2176 *a++ = 0xc1; *a++ = 0xe2; *a++ = 0x10;
2177 *a++ = 0x25; *a++ = 0xff; *a++ = 0xff; *a++ = 0x00; *a++ = 0x00;
2178 *a++ = 0x09; *a++ = 0xd0;
2179
2180 /* eb 00 jmp <okret> */
2181 *a++ = 0xeb; okret1 = a; *a++ = 0;
2182
2183 *skip = (size_t)a - (size_t)skip - 1;
2184
2185 /*
2186 * CASE 2:
2187 * memory = 0x12 0x34 0x56 0x78
2188 * register after lwl: 0x56 0x34 0x12 0x..
2189 */
2190 /* 83 fb 02 cmp $0x2,%ebx */
2191 /* 75 01 jne <skip> */
2192 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x02;
2193 *a++ = 0x75; skip = a; *a++ = 0x01;
2194
2195 /* c1 e2 08 shl $0x08,%edx */
2196 /* 25 ff 00 00 00 and $0xff,%eax */
2197 /* 09 d0 or %edx,%eax */
2198 *a++ = 0xc1; *a++ = 0xe2; *a++ = 0x08;
2199 *a++ = 0x25; *a++ = 0xff; *a++ = 0x00; *a++ = 0x00; *a++ = 0x00;
2200 *a++ = 0x09; *a++ = 0xd0;
2201
2202 /* eb 00 jmp <okret> */
2203 *a++ = 0xeb; okret2 = a; *a++ = 0;
2204
2205 *skip = (size_t)a - (size_t)skip - 1;
2206
2207 /*
2208 * CASE 3:
2209 * memory = 0x12 0x34 0x56 0x78
2210 * register after lwl: 0x78 0x56 0x34 0x12
2211 */
2212 /* 89 d0 mov %edx,%eax */
2213 *a++ = 0x89; *a++ = 0xd0;
2214
2215 /* okret: */
2216 *okret0 = (size_t)a - (size_t)okret0 - 1;
2217 *okret1 = (size_t)a - (size_t)okret1 - 1;
2218 *okret2 = (size_t)a - (size_t)okret2 - 1;
2219
2220 /* 99 cltd */
2221 *a++ = 0x99;
2222 break;
2223
2224 case HI6_LWR:
2225 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
2226 /* 05 34 f2 ff ff add $0xfffff234,%eax */
2227 *a++ = 5;
2228 *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
2229 /* 83 e0 03 and $0x03,%eax */
2230 *a++ = 0x83; *a++ = 0xe0; *a++ = alignment;
2231 /* 89 c3 mov %eax,%ebx */
2232 *a++ = 0x89; *a++ = 0xc3;
2233
2234 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2235
2236 /* ALIGNED LOAD: */
2237 /* 8b 11 mov (%ecx),%edx */
2238 *a++ = 0x8b; *a++ = 0x11;
2239
2240 /*
2241 * CASE 0:
2242 * memory = 0x12 0x34 0x56 0x78
2243 * register after lwr: 0x78 0x56 0x34 0x12
2244 */
2245 /* 83 fb 00 cmp $0x0,%ebx */
2246 /* 75 01 jne <skip> */
2247 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
2248 *a++ = 0x75; skip = a; *a++ = 0x01;
2249
2250 /* 89 d0 mov %edx,%eax */
2251 *a++ = 0x89; *a++ = 0xd0;
2252
2253 /* eb 00 jmp <okret> */
2254 *a++ = 0xeb; okret0 = a; *a++ = 0;
2255
2256 *skip = (size_t)a - (size_t)skip - 1;
2257
2258 /*
2259 * CASE 1:
2260 * memory = 0x12 0x34 0x56 0x78
2261 * register after lwr: 0x.. 0x78 0x56 0x34
2262 */
2263 /* 83 fb 01 cmp $0x1,%ebx */
2264 /* 75 01 jne <skip> */
2265 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x01;
2266 *a++ = 0x75; skip = a; *a++ = 0x01;
2267
2268 /* c1 ea 08 shr $0x8,%edx */
2269 /* 25 00 00 00 ff and $0xff000000,%eax */
2270 /* 09 d0 or %edx,%eax */
2271 *a++ = 0xc1; *a++ = 0xea; *a++ = 0x08;
2272 *a++ = 0x25; *a++ = 0x00; *a++ = 0x00; *a++ = 0x00; *a++ = 0xff;
2273 *a++ = 0x09; *a++ = 0xd0;
2274
2275 /* eb 00 jmp <okret> */
2276 *a++ = 0xeb; okret1 = a; *a++ = 0;
2277
2278 *skip = (size_t)a - (size_t)skip - 1;
2279
2280 /*
2281 * CASE 2:
2282 * memory = 0x12 0x34 0x56 0x78
2283 * register after lwr: 0x.. 0x.. 0x78 0x56
2284 */
2285 /* 83 fb 02 cmp $0x2,%ebx */
2286 /* 75 01 jne <skip> */
2287 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x02;
2288 *a++ = 0x75; skip = a; *a++ = 0x01;
2289
2290 /* c1 ea 10 shr $0x10,%edx */
2291 /* 25 00 00 ff ff and $0xffff0000,%eax */
2292 /* 09 d0 or %edx,%eax */
2293 *a++ = 0xc1; *a++ = 0xea; *a++ = 0x10;
2294 *a++ = 0x25; *a++ = 0x00; *a++ = 0x00; *a++ = 0xff; *a++ = 0xff;
2295 *a++ = 0x09; *a++ = 0xd0;
2296
2297 /* eb 00 jmp <okret> */
2298 *a++ = 0xeb; okret2 = a; *a++ = 0;
2299
2300 *skip = (size_t)a - (size_t)skip - 1;
2301
2302 /*
2303 * CASE 3:
2304 * memory = 0x12 0x34 0x56 0x78
2305 * register after lwr: 0x.. 0x.. 0x.. 0x78
2306 */
2307 /* c1 ea 18 shr $0x18,%edx */
2308 /* 25 00 ff ff ff and $0xffffff00,%eax */
2309 /* 09 d0 or %edx,%eax */
2310 *a++ = 0xc1; *a++ = 0xea; *a++ = 0x18;
2311 *a++ = 0x25; *a++ = 0x00; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff;
2312 *a++ = 0x09; *a++ = 0xd0;
2313
2314 /* okret: */
2315 *okret0 = (size_t)a - (size_t)okret0 - 1;
2316 *okret1 = (size_t)a - (size_t)okret1 - 1;
2317 *okret2 = (size_t)a - (size_t)okret2 - 1;
2318
2319 /* 99 cltd */
2320 *a++ = 0x99;
2321 break;
2322
2323 case HI6_SD:
2324 if (bigendian) {
2325 /* 0f c8 bswap %eax */
2326 /* 0f ca bswap %edx */
2327 /* 89 11 mov %edx,(%ecx) */
2328 /* 89 41 04 mov %eax,0x4(%ecx) */
2329 *a++ = 0x0f; *a++ = 0xc8;
2330 *a++ = 0x0f; *a++ = 0xca;
2331 *a++ = 0x89; *a++ = 0x11;
2332 *a++ = 0x89; *a++ = 0x41; *a++ = 0x04;
2333 } else {
2334 /* 89 01 mov %eax,(%ecx) */
2335 /* 89 51 04 mov %edx,0x4(%ecx) */
2336 *a++ = 0x89; *a++ = 0x01;
2337 *a++ = 0x89; *a++ = 0x51; *a++ = 0x04;
2338 }
2339 break;
2340 case HI6_SW:
2341 /* 0f c8 bswap %eax (big endian) */
2342 if (bigendian) {
2343 *a++ = 0x0f; *a++ = 0xc8;
2344 }
2345 /* 89 01 mov %eax,(%ecx) */
2346 *a++ = 0x89; *a++ = 0x01;
2347 break;
2348 case HI6_SH:
2349 /* 86 c4 xchg %al,%ah (big endian) */
2350 if (bigendian) {
2351 *a++ = 0x86; *a++ = 0xc4;
2352 }
2353 /* 66 89 01 mov %ax,(%ecx) */
2354 *a++ = 0x66; *a++ = 0x89; *a++ = 0x01;
2355 break;
2356 case HI6_SB:
2357 /* 88 01 mov %al,(%ecx) */
2358 *a++ = 0x88; *a++ = 0x01;
2359 break;
2360
2361 case HI6_SWL:
2362 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
2363 /* 05 34 f2 ff ff add $0xfffff234,%eax */
2364 *a++ = 5;
2365 *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
2366 /* 83 e0 03 and $0x03,%eax */
2367 *a++ = 0x83; *a++ = 0xe0; *a++ = alignment;
2368 /* 89 c3 mov %eax,%ebx */
2369 *a++ = 0x89; *a++ = 0xc3;
2370
2371 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2372
2373 /* ALIGNED LOAD: */
2374 /* 8b 11 mov (%ecx),%edx */
2375 *a++ = 0x8b; *a++ = 0x11;
2376
2377 /*
2378 * CASE 0:
2379 * memory (edx): 0x12 0x34 0x56 0x78
2380 * register (eax): 0x89abcdef
2381 * mem after swl: 0x89 0x.. 0x.. 0x..
2382 */
2383 /* 83 fb 00 cmp $0x0,%ebx */
2384 /* 75 01 jne <skip> */
2385 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
2386 *a++ = 0x75; skip = a; *a++ = 0x01;
2387
2388 /* 81 e2 00 ff ff ff and $0xffffff00,%edx */
2389 /* c1 e8 18 shr $0x18,%eax */
2390 /* 09 d0 or %edx,%eax */
2391 *a++ = 0x81; *a++ = 0xe2; *a++ = 0x00; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff;
2392 *a++ = 0xc1; *a++ = 0xe8; *a++ = 0x18;
2393 *a++ = 0x09; *a++ = 0xd0;
2394
2395 /* eb 00 jmp <okret> */
2396 *a++ = 0xeb; okret0 = a; *a++ = 0;
2397
2398 *skip = (size_t)a - (size_t)skip - 1;
2399
2400 /*
2401 * CASE 1:
2402 * memory (edx): 0x12 0x34 0x56 0x78
2403 * register (eax): 0x89abcdef
2404 * mem after swl: 0xab 0x89 0x.. 0x..
2405 */
2406 /* 83 fb 01 cmp $0x1,%ebx */
2407 /* 75 01 jne <skip> */
2408 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x01;
2409 *a++ = 0x75; skip = a; *a++ = 0x01;
2410
2411 /* 81 e2 00 00 ff ff and $0xffff0000,%edx */
2412 /* c1 e8 10 shr $0x10,%eax */
2413 /* 09 d0 or %edx,%eax */
2414 *a++ = 0x81; *a++ = 0xe2; *a++ = 0x00; *a++ = 0x00; *a++ = 0xff; *a++ = 0xff;
2415 *a++ = 0xc1; *a++ = 0xe8; *a++ = 0x10;
2416 *a++ = 0x09; *a++ = 0xd0;
2417
2418 /* eb 00 jmp <okret> */
2419 *a++ = 0xeb; okret1 = a; *a++ = 0;
2420
2421 *skip = (size_t)a - (size_t)skip - 1;
2422
2423 /*
2424 * CASE 2:
2425 * memory (edx): 0x12 0x34 0x56 0x78
2426 * register (eax): 0x89abcdef
2427 * mem after swl: 0xcd 0xab 0x89 0x..
2428 */
2429 /* 83 fb 02 cmp $0x2,%ebx */
2430 /* 75 01 jne <skip> */
2431 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x02;
2432 *a++ = 0x75; skip = a; *a++ = 0x01;
2433
2434 /* 81 e2 00 00 00 ff and $0xff000000,%edx */
2435 /* c1 e8 08 shr $0x08,%eax */
2436 /* 09 d0 or %edx,%eax */
2437 *a++ = 0x81; *a++ = 0xe2; *a++ = 0x00; *a++ = 0x00; *a++ = 0x00; *a++ = 0xff;
2438 *a++ = 0xc1; *a++ = 0xe8; *a++ = 0x08;
2439 *a++ = 0x09; *a++ = 0xd0;
2440
2441 /* eb 00 jmp <okret> */
2442 *a++ = 0xeb; okret2 = a; *a++ = 0;
2443
2444 *skip = (size_t)a - (size_t)skip - 1;
2445
2446 /*
2447 * CASE 3:
2448 * memory (edx): 0x12 0x34 0x56 0x78
2449 * register (eax): 0x89abcdef
2450 * mem after swl: 0xef 0xcd 0xab 0x89
2451 */
2452 /* eax = eax :-) */
2453
2454 /* okret: */
2455 *okret0 = (size_t)a - (size_t)okret0 - 1;
2456 *okret1 = (size_t)a - (size_t)okret1 - 1;
2457 *okret2 = (size_t)a - (size_t)okret2 - 1;
2458
2459 /* Store back to memory: */
2460 /* 89 01 mov %eax,(%ecx) */
2461 *a++ = 0x89; *a++ = 0x01;
2462 break;
2463
2464 case HI6_SWR:
2465 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
2466 /* 05 34 f2 ff ff add $0xfffff234,%eax */
2467 *a++ = 5;
2468 *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
2469 /* 83 e0 03 and $0x03,%eax */
2470 *a++ = 0x83; *a++ = 0xe0; *a++ = alignment;
2471 /* 89 c3 mov %eax,%ebx */
2472 *a++ = 0x89; *a++ = 0xc3;
2473
2474 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2475
2476 /* ALIGNED LOAD: */
2477 /* 8b 11 mov (%ecx),%edx */
2478 *a++ = 0x8b; *a++ = 0x11;
2479
2480 /*
2481 * CASE 0:
2482 * memory (edx): 0x12 0x34 0x56 0x78
2483 * register (eax): 0x89abcdef
2484 * mem after swr: 0xef 0xcd 0xab 0x89
2485 */
2486 /* 83 fb 00 cmp $0x0,%ebx */
2487 /* 75 01 jne <skip> */
2488 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
2489 *a++ = 0x75; skip = a; *a++ = 0x01;
2490
2491 /* eax = eax, so do nothing */
2492
2493 /* eb 00 jmp <okret> */
2494 *a++ = 0xeb; okret0 = a; *a++ = 0;
2495
2496 *skip = (size_t)a - (size_t)skip - 1;
2497
2498 /*
2499 * CASE 1:
2500 * memory (edx): 0x12 0x34 0x56 0x78
2501 * register (eax): 0x89abcdef
2502 * mem after swr: 0x12 0xef 0xcd 0xab
2503 */
2504 /* 83 fb 01 cmp $0x1,%ebx */
2505 /* 75 01 jne <skip> */
2506 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x01;
2507 *a++ = 0x75; skip = a; *a++ = 0x01;
2508
2509 /* 81 e2 ff 00 00 00 and $0x000000ff,%edx */
2510 /* c1 e0 08 shl $0x08,%eax */
2511 /* 09 d0 or %edx,%eax */
2512 *a++ = 0x81; *a++ = 0xe2; *a++ = 0xff; *a++ = 0x00; *a++ = 0x00; *a++ = 0x00;
2513 *a++ = 0xc1; *a++ = 0xe0; *a++ = 0x08;
2514 *a++ = 0x09; *a++ = 0xd0;
2515
2516 /* eb 00 jmp <okret> */
2517 *a++ = 0xeb; okret1 = a; *a++ = 0;
2518
2519 *skip = (size_t)a - (size_t)skip - 1;
2520
2521 /*
2522 * CASE 2:
2523 * memory (edx): 0x12 0x34 0x56 0x78
2524 * register (eax): 0x89abcdef
2525 * mem after swr: 0x12 0x34 0xef 0xcd
2526 */
2527 /* 83 fb 02 cmp $0x2,%ebx */
2528 /* 75 01 jne <skip> */
2529 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x02;
2530 *a++ = 0x75; skip = a; *a++ = 0x01;
2531
2532 /* 81 e2 ff ff 00 00 and $0x0000ffff,%edx */
2533 /* c1 e0 10 shl $0x10,%eax */
2534 /* 09 d0 or %edx,%eax */
2535 *a++ = 0x81; *a++ = 0xe2; *a++ = 0xff; *a++ = 0xff; *a++ = 0x00; *a++ = 0x00;
2536 *a++ = 0xc1; *a++ = 0xe0; *a++ = 0x10;
2537 *a++ = 0x09; *a++ = 0xd0;
2538
2539 /* eb 00 jmp <okret> */
2540 *a++ = 0xeb; okret2 = a; *a++ = 0;
2541
2542 *skip = (size_t)a - (size_t)skip - 1;
2543
2544 /*
2545 * CASE 3:
2546 * memory (edx): 0x12 0x34 0x56 0x78
2547 * register (eax): 0x89abcdef
2548 * mem after swr: 0x12 0x34 0x56 0xef
2549 */
2550 /* 81 e2 ff ff ff 00 and $0x00ffffff,%edx */
2551 /* c1 e0 18 shl $0x18,%eax */
2552 /* 09 d0 or %edx,%eax */
2553 *a++ = 0x81; *a++ = 0xe2; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff; *a++ = 0x00;
2554 *a++ = 0xc1; *a++ = 0xe0; *a++ = 0x18;
2555 *a++ = 0x09; *a++ = 0xd0;
2556
2557
2558 /* okret: */
2559 *okret0 = (size_t)a - (size_t)okret0 - 1;
2560 *okret1 = (size_t)a - (size_t)okret1 - 1;
2561 *okret2 = (size_t)a - (size_t)okret2 - 1;
2562
2563 /* Store back to memory: */
2564 /* 89 01 mov %eax,(%ecx) */
2565 *a++ = 0x89; *a++ = 0x01;
2566 break;
2567
2568 default:
2569 bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
2570 }
2571
2572 if (load && rt != 0)
2573 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2574
2575 *addrp = a;
2576 bintrans_write_pc_inc(addrp);
2577 return 1;
2578 }
2579
2580
2581 /*
2582 * bintrans_write_instruction__tlb_rfe_etc():
2583 */
2584 static int bintrans_write_instruction__tlb_rfe_etc(unsigned char **addrp,
2585 int itype)
2586 {
2587 unsigned char *a;
2588 int ofs;
2589
2590 switch (itype) {
2591 case CALL_TLBP:
2592 case CALL_TLBR:
2593 case CALL_TLBWR:
2594 case CALL_TLBWI:
2595 case CALL_RFE:
2596 case CALL_ERET:
2597 case CALL_SYSCALL:
2598 case CALL_BREAK:
2599 break;
2600 default:
2601 return 0;
2602 }
2603
2604 a = *addrp;
2605
2606 /* Put back PC into the cpu struct, both as pc and pc_last */
2607 *a++ = 0x89; *a++ = 0xbe; *a++ = ofs_pc&255;
2608 *a++ = (ofs_pc>>8)&255; *a++ = (ofs_pc>>16)&255;
2609 *a++ = (ofs_pc>>24)&255; /* mov %edi,pc(%esi) */
2610
2611 *a++ = 0x89; *a++ = 0xbe; *a++ = ofs_pc_last&255;
2612 *a++ = (ofs_pc_last>>8)&255; *a++ = (ofs_pc_last>>16)&255;
2613 *a++ = (ofs_pc_last>>24)&255; /* mov %edi,pc_last(%esi) */
2614
2615 /* ... and make sure that the high 32 bits are ALSO in pc_last: */
2616 /* 8b 86 38 12 00 00 mov 0x1238(%esi),%eax */
2617 ofs = ofs_pc + 4;
2618 *a++ = 0x8b; *a++ = 0x86; *a++ = ofs&255;
2619 *a++ = (ofs>>8)&255; *a++ = (ofs>>16)&255;
2620 *a++ = (ofs>>24)&255; /* mov %edi,pc(%esi) */
2621
2622 /* 89 86 34 12 00 00 mov %eax,0x1234(%esi) */
2623 ofs = ofs_pc_last + 4;
2624 *a++ = 0x89; *a++ = 0x86; *a++ = ofs&255;
2625 *a++ = (ofs>>8)&255; *a++ = (ofs>>16)&255;
2626 *a++ = (ofs>>24)&255; /* mov %edi,pc(%esi) */
2627
2628 switch (itype) {
2629 case CALL_TLBP:
2630 case CALL_TLBR:
2631 /* push readflag */
2632 *a++ = 0x6a; *a++ = (itype == CALL_TLBR);
2633 ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_tlbpr) - (size_t)&dummy_cpu;
2634 break;
2635 case CALL_TLBWR:
2636 case CALL_TLBWI:
2637 /* push randomflag */
2638 *a++ = 0x6a; *a++ = (itype == CALL_TLBWR);
2639 ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_tlbwri) - (size_t)&dummy_cpu;
2640 break;
2641 case CALL_SYSCALL:
2642 case CALL_BREAK:
2643 /* push randomflag */
2644 *a++ = 0x6a; *a++ = (itype == CALL_BREAK? EXCEPTION_BP : EXCEPTION_SYS);
2645 ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_simple_exception) - (size_t)&dummy_cpu;
2646 break;
2647 case CALL_RFE:
2648 ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_rfe) - (size_t)&dummy_cpu;
2649 break;
2650 case CALL_ERET:
2651 ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_eret) - (size_t)&dummy_cpu;
2652 break;
2653 }
2654
2655 /* push cpu (esi) */
2656 *a++ = 0x56;
2657
2658 /* eax = points to the right function */
2659 *a++ = 0x8b; *a++ = 0x86;
2660 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
2661
2662 /* ff d0 call *%eax */
2663 *a++ = 0xff; *a++ = 0xd0;
2664
2665 switch (itype) {
2666 case CALL_RFE:
2667 case CALL_ERET:
2668 /* 83 c4 04 add $4,%esp */
2669 *a++ = 0x83; *a++ = 0xc4; *a++ = 4;
2670 break;
2671 default:
2672 /* 83 c4 08 add $8,%esp */
2673 *a++ = 0x83; *a++ = 0xc4; *a++ = 8;
2674 break;
2675 }
2676
2677 /* Load PC from the cpu struct. */
2678 *a++ = 0x8b; *a++ = 0xbe; *a++ = ofs_pc&255;
2679 *a++ = (ofs_pc>>8)&255; *a++ = (ofs_pc>>16)&255;
2680 *a++ = (ofs_pc>>24)&255; /* mov pc(%esi),%edi */
2681
2682 *addrp = a;
2683
2684 switch (itype) {
2685 case CALL_ERET:
2686 case CALL_SYSCALL:
2687 case CALL_BREAK:
2688 break;
2689 default:
2690 bintrans_write_pc_inc(addrp);
2691 }
2692
2693 return 1;
2694 }
2695
2696
2697 /*
2698 * bintrans_backend_init():
2699 *
2700 * This is neccessary for broken GCC 2.x. (For GCC 3.x, this wouldn't be
2701 * neccessary, and the old code would have worked.)
2702 */
2703 static void bintrans_backend_init(void)
2704 {
2705 int size;
2706 unsigned char *p;
2707
2708
2709 /* "runchunk": */
2710 size = 64; /* NOTE: This MUST be enough, or we fail */
2711 p = (unsigned char *)mmap(NULL, size, PROT_READ | PROT_WRITE |
2712 PROT_EXEC, MAP_ANON | MAP_PRIVATE, -1, 0);
2713
2714 /* If mmap() failed, try malloc(): */
2715 if (p == NULL) {
2716 p = malloc(size);
2717 if (p == NULL) {
2718 fprintf(stderr, "bintrans_backend_init():"
2719 " out of memory\n");
2720 exit(1);
2721 }
2722 }
2723
2724 bintrans_runchunk = (void *)p;
2725
2726 *p++ = 0x57; /* push %edi */
2727 *p++ = 0x56; /* push %esi */
2728 *p++ = 0x55; /* push %ebp */
2729 *p++ = 0x53; /* push %ebx */
2730
2731 /*
2732 * In all translated code, esi points to the cpu struct, and
2733 * ebp is the nr of executed (translated) instructions.
2734 */
2735
2736 /* 0=ebx, 4=ebp, 8=esi, 0xc=edi, 0x10=retaddr, 0x14=arg0, 0x18=arg1 */
2737
2738 /* mov 0x8(%esp,1),%esi */
2739 *p++ = 0x8b; *p++ = 0x74; *p++ = 0x24; *p++ = 0x14;
2740
2741 /* mov nr_instr(%esi),%ebp */
2742 *p++ = 0x8b; *p++ = 0xae; *p++ = ofs_i&255; *p++ = (ofs_i>>8)&255;
2743 *p++ = (ofs_i>>16)&255; *p++ = (ofs_i>>24)&255;
2744
2745 /* mov pc(%esi),%edi */
2746 *p++ = 0x8b; *p++ = 0xbe; *p++ = ofs_pc&255; *p++ = (ofs_pc>>8)&255;
2747 *p++ = (ofs_pc>>16)&255; *p++ = (ofs_pc>>24)&255;
2748
2749 /* call *0x18(%esp,1) */
2750 *p++ = 0xff; *p++ = 0x54; *p++ = 0x24; *p++ = 0x18;
2751
2752 /* mov %ebp,0x1234(%esi) */
2753 *p++ = 0x89; *p++ = 0xae; *p++ = ofs_i&255; *p++ = (ofs_i>>8)&255;
2754 *p++ = (ofs_i>>16)&255; *p++ = (ofs_i>>24)&255;
2755
2756 /* mov %edi,pc(%esi) */
2757 *p++ = 0x89; *p++ = 0xbe; *p++ = ofs_pc&255; *p++ = (ofs_pc>>8)&255;
2758 *p++ = (ofs_pc>>16)&255; *p++ = (ofs_pc>>24)&255;
2759
2760 *p++ = 0x5b; /* pop %ebx */
2761 *p++ = 0x5d; /* pop %ebp */
2762 *p++ = 0x5e; /* pop %esi */
2763 *p++ = 0x5f; /* pop %edi */
2764 *p++ = 0xc3; /* ret */
2765
2766
2767
2768 /* "jump_to_32bit_pc": */
2769 size = 128; /* NOTE: This MUST be enough, or we fail */
2770 p = (unsigned char *)mmap(NULL, size, PROT_READ | PROT_WRITE |
2771 PROT_EXEC, MAP_ANON | MAP_PRIVATE, -1, 0);
2772
2773 /* If mmap() failed, try malloc(): */
2774 if (p == NULL) {
2775 p = malloc(size);
2776 if (p == NULL) {
2777 fprintf(stderr, "bintrans_backend_init():"
2778 " out of memory\n");
2779 exit(1);
2780 }
2781 }
2782
2783 bintrans_jump_to_32bit_pc = (void *)p;
2784
2785 /* Don't execute too many instructions. */
2786 /* 81 fd f0 1f 00 00 cmpl $0x1ff0,%ebp */
2787 /* 7c 01 jl <okk> */
2788 /* c3 ret */
2789 *p++ = 0x81; *p++ = 0xfd; *p++ = (N_SAFE_BINTRANS_LIMIT-1) & 255;
2790 *p++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8) & 255; *p++ = 0; *p++ = 0;
2791 *p++ = 0x7c; *p++ = 0x01;
2792 *p++ = 0xc3;
2793
2794 /*
2795 * ebx = ((vaddr >> 22) & 1023) * sizeof(void *)
2796 *
2797 * 89 c3 mov %eax,%ebx
2798 * c1 eb 14 shr $20,%ebx
2799 * 81 e3 fc 0f 00 00 and $0xffc,%ebx
2800 */
2801 *p++ = 0x89; *p++ = 0xc3;
2802 *p++ = 0xc1; *p++ = 0xeb; *p++ = 0x14;
2803 *p++ = 0x81; *p++ = 0xe3; *p++ = 0xfc; *p++ = 0x0f; *p++ = 0; *p++ = 0;
2804
2805 /*
2806 * ecx = vaddr_to_hostaddr_table0
2807 *
2808 * 8b 8e 34 12 00 00 mov 0x1234(%esi),%ecx
2809 */
2810 *p++ = 0x8b; *p++ = 0x8e;
2811 *p++ = ofs_tabl0 & 255; *p++ = (ofs_tabl0 >> 8) & 255;
2812 *p++ = (ofs_tabl0 >> 16) & 255; *p++ = (ofs_tabl0 >> 24) & 255;
2813
2814 /*
2815 * ecx = vaddr_to_hostaddr_table0[a]
2816 *
2817 * 8b 0c 19 mov (%ecx,%ebx),%ecx
2818 */
2819 *p++ = 0x8b; *p++ = 0x0c; *p++ = 0x19;
2820
2821 /*
2822 * ebx = ((vaddr >> 12) & 1023) * sizeof(void *)
2823 *
2824 * 89 c3 mov %eax,%ebx
2825 * c1 eb 0a shr $10,%ebx
2826 * 81 e3 fc 0f 00 00 and $0xffc,%ebx
2827 */
2828 *p++ = 0x89; *p++ = 0xc3;
2829 *p++ = 0xc1; *p++ = 0xeb; *p++ = 0x0a;
2830 *p++ = 0x81; *p++ = 0xe3; *p++ = 0xfc; *p++ = 0x0f; *p++ = 0; *p++ = 0;
2831
2832 /*
2833 * ecx = vaddr_to_hostaddr_table0[a][b].cd.mips.chunks
2834 *
2835 * 8b 8c 19 56 34 12 00 mov 0x123456(%ecx,%ebx,1),%ecx
2836 */
2837 *p++ = 0x8b; *p++ = 0x8c; *p++ = 0x19; *p++ = ofs_chunks & 255;
2838 *p++ = (ofs_chunks >> 8) & 255; *p++ = (ofs_chunks >> 16) & 255;
2839 *p++ = (ofs_chunks >> 24) & 255;
2840
2841 /*
2842 * ecx = NULL? Then return with failure.
2843 *
2844 * 83 f9 00 cmp $0x0,%ecx
2845 * 75 01 jne <okzzz>
2846 */
2847 *p++ = 0x83; *p++ = 0xf9; *p++ = 0x00;
2848 *p++ = 0x75; *p++ = 0x01;
2849 *p++ = 0xc3; /* TODO: failure? */
2850
2851 /*
2852 * 25 fc 0f 00 00 and $0xffc,%eax
2853 * 01 c1 add %eax,%ecx
2854 *
2855 * 8b 01 mov (%ecx),%eax
2856 *
2857 * 83 f8 00 cmp $0x0,%eax
2858 * 75 01 jne <ok>
2859 * c3 ret
2860 */
2861 *p++ = 0x25; *p++ = 0xfc; *p++ = 0x0f; *p++ = 0; *p++ = 0;
2862 *p++ = 0x01; *p++ = 0xc1;
2863
2864 *p++ = 0x8b; *p++ = 0x01;
2865
2866 *p++ = 0x83; *p++ = 0xf8; *p++ = 0x00;
2867 *p++ = 0x75; *p++ = 0x01;
2868 *p++ = 0xc3; /* TODO: failure? */
2869
2870 /* 03 86 78 56 34 12 add 0x12345678(%esi),%eax */
2871 /* ff e0 jmp *%eax */
2872 *p++ = 0x03; *p++ = 0x86; *p++ = ofs_chunkbase & 255;
2873 *p++ = (ofs_chunkbase >> 8) & 255; *p++ = (ofs_chunkbase >> 16) & 255;
2874 *p++ = (ofs_chunkbase >> 24) & 255;
2875 *p++ = 0xff; *p++ = 0xe0;
2876
2877
2878
2879 /* "load_32bit": */
2880 size = 48; /* NOTE: This MUST be enough, or we fail */
2881 p = (unsigned char *)mmap(NULL, size, PROT_READ | PROT_WRITE |
2882 PROT_EXEC, MAP_ANON | MAP_PRIVATE, -1, 0);
2883
2884 /* If mmap() failed, try malloc(): */
2885 if (p == NULL) {
2886 p = malloc(size);
2887 if (p == NULL) {
2888 fprintf(stderr, "bintrans_backend_init():"
2889 " out of memory\n");
2890 exit(1);
2891 }
2892 }
2893
2894 bintrans_load_32bit = (void *)p;
2895
2896 /* ebx = vaddr >> 12; */
2897 *p++ = 0x89; *p++ = 0xc3; /* mov %eax, %ebx */
2898 *p++ = 0xc1; *p++ = 0xeb; *p++ = 0x0c; /* shr $12, %ebx */
2899
2900 /* ecx = cpu->cd.mips.host_load */
2901 *p++ = 0x8b; *p++ = 0x8e; *p++ = ofs_h_l & 255;
2902 *p++ = (ofs_h_l >> 8) & 255;
2903 *p++ = (ofs_h_l >> 16) & 255; *p++ = (ofs_h_l >> 24) & 255;
2904
2905 /* ecx = host_load[a] */
2906 *p++ = 0x8b; *p++ = 0x0c; *p++ = 0x99; /* mov (%ecx,%ebx,4),%ecx */
2907
2908 /* ret */
2909 *p++ = 0xc3;
2910
2911
2912
2913 /* "store_32bit": */
2914 size = 48; /* NOTE: This MUST be enough, or we fail */
2915 p = (unsigned char *)mmap(NULL, size, PROT_READ | PROT_WRITE |
2916 PROT_EXEC, MAP_ANON | MAP_PRIVATE, -1, 0);
2917
2918 /* If mmap() failed, try malloc(): */
2919 if (p == NULL) {
2920 p = malloc(size);
2921 if (p == NULL) {
2922 fprintf(stderr, "bintrans_backend_init():"
2923 " out of memory\n");
2924 exit(1);
2925 }
2926 }
2927
2928 bintrans_store_32bit = (void *)p;
2929
2930 /* ebx = vaddr >> 12; */
2931 *p++ = 0x89; *p++ = 0xc3; /* mov %eax, %ebx */
2932 *p++ = 0xc1; *p++ = 0xeb; *p++ = 0x0c; /* shr $12, %ebx */
2933
2934 /* ecx = cpu->cd.mips.host_store */
2935 *p++ = 0x8b; *p++ = 0x8e; *p++ = ofs_h_s & 255;
2936 *p++ = (ofs_h_s >> 8) & 255;
2937 *p++ = (ofs_h_s >> 16) & 255; *p++ = (ofs_h_s >> 24) & 255;
2938
2939 /* ecx = host_store[a] */
2940 *p++ = 0x8b; *p++ = 0x0c; *p++ = 0x99; /* mov (%ecx,%ebx,4),%ecx */
2941
2942 /* ret */
2943 *p++ = 0xc3;
2944 }
2945

  ViewVC Help
Powered by ViewVC 1.1.26