/[gxemul]/trunk/src/bintrans_i386.c
This is repository of my old source code which isn't updated any more. Go to git.rot13.org for current projects!
ViewVC logotype

Annotation of /trunk/src/bintrans_i386.c

Parent Directory Parent Directory | Revision Log Revision Log


Revision 12 - (hide annotations)
Mon Oct 8 16:18:38 2007 UTC (16 years, 6 months ago) by dpavlin
File MIME type: text/plain
File size: 85064 byte(s)
++ trunk/HISTORY	(local)
$Id: HISTORY,v 1.905 2005/08/16 09:16:24 debug Exp $
20050628	Continuing the work on the ARM translation engine. end_of_page
		works. Experimenting with load/store translation caches
		(virtual -> physical -> host).
20050629	More ARM stuff (memory access translation cache, mostly). This
		might break a lot of stuff elsewhere, probably some MIPS-
		related translation things.
20050630	Many load/stores are now automatically generated and included
		into cpu_arm_instr.c; 1024 functions in total (!).
		Fixes based on feedback from Alec Voropay: only print 8 hex
		digits instead of 16 in some cases when emulating 32-bit
		machines; similar 8 vs 16 digit fix for breakpoint addresses;
		4Kc has 16 TLB entries, not 48; the MIPS config select1
		register is now printed with "reg ,0".
		Also changing many other occurances of 16 vs 8 digit output.
		Adding cache associativity fields to mips_cpu_types.h; updating
		some other cache fields; making the output of
		mips_cpu_dumpinfo() look nicer.
		Generalizing the bintrans stuff for device accesses to also
		work with the new translation system. (This might also break
		some MIPS things.)
		Adding multi-load/store instructions to the ARM disassembler
		and the translator, and some optimizations of various kinds.
20050701	Adding a simple dev_disk (it can read/write sectors from
		disk images).
20050712	Adding dev_ether (a simple ethernet send/receive device).
		Debugger command "ninstrs" for toggling show_nr_of_instructions
		during runtime.
		Removing the framebuffer logo.
20050713	Continuing on dev_ether.
		Adding a dummy cpu_alpha (again).
20050714	More work on cpu_alpha.
20050715	More work on cpu_alpha. Many instructions work, enough to run
		a simple framebuffer fill test (similar to the ARM test).
20050716	More Alpha stuff.
20050717	Minor updates (Alpha stuff).
20050718	Minor updates (Alpha stuff).
20050719	Generalizing some Alpha instructions.
20050720	More Alpha-related updates.
20050721	Continuing on cpu_alpha. Importing rpb.h from NetBSD/alpha.
20050722	Alpha-related updates: userland stuff (Hello World using
		write() compiled statically for FreeBSD/Alpha runs fine), and
		more instructions are now implemented.
20050723	Fixing ldq_u and stq_u.
		Adding more instructions (conditional moves, masks, extracts,
		shifts).
20050724	More FreeBSD/Alpha userland stuff, and adding some more
		instructions (inserts).
20050725	Continuing on the Alpha stuff. (Adding dummy ldt/stt.)
		Adding a -A command line option to turn off alignment checks
		in some cases (for translated code).
		Trying to remove the old bintrans code which updated the pc
		and nr_of_executed_instructions for every instruction.
20050726	Making another attempt att removing the pc/nr of instructions
		code. This time it worked, huge performance increase for
		artificial test code, but performance loss for real-world
		code :-( so I'm scrapping that code for now.
		Tiny performance increase on Alpha (by using ret instead of
		jmp, to play nice with the Alpha's branch prediction) for the
		old MIPS bintrans backend.
20050727	Various minor fixes and cleanups.
20050728	Switching from a 2-level virtual to host/physical translation
		system for ARM emulation, to a 1-level translation.
		Trying to switch from 2-level to 1-level for the MIPS bintrans
		system as well (Alpha only, so far), but there is at least one
		problem: caches and/or how they work with device mappings.
20050730	Doing the 2-level to 1-level conversion for the i386 backend.
		The cache/device bug is still there for R2K/3K :(
		Various other minor updates (Malta etc).
		The mc146818 clock now updates the UIP bit in a way which works
		better with Linux for at least sgimips and Malta emulation.
		Beginning the work on refactoring the dyntrans system.
20050731	Continuing the dyntrans refactoring.
		Fixing a small but serious host alignment bug in memory_rw.
		Adding support for big-endian load/stores to the i386 bintrans
		backend.
		Another minor i386 bintrans backend update: stores from the
		zero register are now one (or two) loads shorter.
		The slt and sltu instructions were incorrectly implemented for
		the i386 backend; only using them for 32-bit mode for now.
20050801	Continuing the dyntrans refactoring.
		Cleanup of the ns16550 serial controller (removing unnecessary
		code).
		Bugfix (memory corruption bug) in dev_gt, and a patch/hack from
		Alec Voropay for Linux/Malta.
20050802	More cleanup/refactoring of the dyntrans subsystem: adding
		phys_page pointers to the lookup tables, for quick jumps
		between translated pages.
		Better fix for the ns16550 device (but still no real FIFO
		functionality).
		Converting cpu_ppc to the new dyntrans system. This means that
		I will have to start from scratch with implementing each
		instruction, and figure out how to implement dual 64/32-bit
		modes etc.
		Removing the URISC CPU family, because it was useless.
20050803	When selecting a machine type, the main type can now be omitted
		if the subtype name is unique. (I.e. -E can be omitted.)
		Fixing a dyntrans/device update bug. (Writes to offset 0 of
		a device could sometimes go unnoticed.)
		Adding an experimental "instruction combination" hack for
		ARM for memset-like byte fill loops.
20050804	Minor progress on cpu_alpha and related things.
		Finally fixing the MIPS dmult/dmultu bugs.
		Fixing some minor TODOs.
20050805	Generalizing the 8259 PIC. It now also works with Cobalt
		and evbmips emulation, in addition to the x86 hack.
		Finally converting the ns16550 device to use devinit.
		Continuing the work on the dyntrans system. Thinking about
		how to add breakpoints.
20050806	More dyntrans updates. Breakpoints seem to work now.
20050807	Minor updates: cpu_alpha and related things; removing
		dev_malta (as it isn't used any more).
		Dyntrans: working on general "show trace tree" support.
		The trace tree stuff now works with both the old MIPS code and
		with newer dyntrans modes. :)
		Continuing on Alpha-related stuff (trying to get *BSD to boot
		a bit further, adding more instructions, etc).
20050808	Adding a dummy IA64 cpu family, and continuing the refactoring
		of the dyntrans system.
		Removing the regression test stuff, because it was more or
		less useless.
		Adding loadlinked/storeconditional type instructions to the
		Alpha emulation. (Needed for Linux/alpha. Not very well tested
		yet.)
20050809	The function call trace tree now prints a per-function nr of
		arguments. (Semi-meaningless, since that data isn't read yet
		from the ELFs; some hardcoded symbols such as memcpy() and
		strlen() work fine, though.)
		More dyntrans refactoring; taking out more of the things that
		are common to all cpu families.
20050810	Working on adding support for "dual mode" for PPC dyntrans
		(i.e. both 64-bit and 32-bit modes).
		(Re)adding some simple PPC instructions.
20050811	Adding a dummy M68K cpu family. The dyntrans system isn't ready
		for variable-length ISAs yet, so it's completely bogus so far.
		Re-adding more PPC instructions.
		Adding a hack to src/file.c which allows OpenBSD/mac68k a.out
		kernels to be loaded.
		Beginning to add PPC loads/stores. So far they only work in
		32-bit mode.
20050812	The configure file option "add_remote" now accepts symbolic
		host names, in addition to numeric IPv4 addresses.
		Re-adding more PPC instructions.
20050814	Continuing to port back more PPC instructions.
		Found and fixed the cache/device write-update bug for 32-bit
		MIPS bintrans. :-)
		Triggered a really weird and annoying bug in Compaq's C
		compiler; ccc sometimes outputs code which loads from an
		address _before_ checking whether the pointer was NULL or not.
		(I'm not sure how to handle this problem.)
20050815	Removing all of the old x86 instruction execution code; adding
		a new (dummy) dyntrans module for x86.
		Taking the first steps to extend the dyntrans system to support
		variable-length instructions.
		Slowly preparing for the next release.
20050816	Adding a dummy SPARC cpu module.
		Minor updates (documentation etc) for the release.

==============  RELEASE 0.3.5  ==============


1 dpavlin 2 /*
2     * Copyright (C) 2004-2005 Anders Gavare. All rights reserved.
3     *
4     * Redistribution and use in source and binary forms, with or without
5     * modification, are permitted provided that the following conditions are met:
6     *
7     * 1. Redistributions of source code must retain the above copyright
8     * notice, this list of conditions and the following disclaimer.
9     * 2. Redistributions in binary form must reproduce the above copyright
10     * notice, this list of conditions and the following disclaimer in the
11     * documentation and/or other materials provided with the distribution.
12     * 3. The name of the author may not be used to endorse or promote products
13     * derived from this software without specific prior written permission.
14     *
15     * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
16     * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
17     * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
18     * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
19     * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
20     * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
21     * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
22     * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
23     * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
24     * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
25     * SUCH DAMAGE.
26     *
27     *
28 dpavlin 12 * $Id: bintrans_i386.c,v 1.83 2005/07/31 08:47:57 debug Exp $
29 dpavlin 2 *
30     * i386 specific code for dynamic binary translation.
31     * See bintrans.c for more information. Included from bintrans.c.
32     *
33     * Translated code uses the following conventions at all time:
34     *
35     * esi points to the cpu struct
36     * edi lowest 32 bits of cpu->pc
37     * ebp contains cpu->bintrans_instructions_executed
38     */
39    
40    
41     struct cpu dummy_cpu;
42     struct mips_coproc dummy_coproc;
43     struct vth32_table dummy_vth32_table;
44    
45    
46     /*
47     * bintrans_host_cacheinvalidate()
48     *
49     * Invalidate the host's instruction cache. On i386, this isn't necessary,
50     * so this is an empty function.
51     */
52     static void bintrans_host_cacheinvalidate(unsigned char *p, size_t len)
53     {
54     /* Do nothing. */
55     }
56    
57    
58     /* offsetof (in stdarg.h) could possibly be used, but I'm not sure
59     if it will take care of the compiler problems... */
60    
61     #define ofs_i (((size_t)&dummy_cpu.cd.mips.bintrans_instructions_executed) - ((size_t)&dummy_cpu))
62     #define ofs_pc (((size_t)&dummy_cpu.pc) - ((size_t)&dummy_cpu))
63     #define ofs_pc_last (((size_t)&dummy_cpu.cd.mips.pc_last) - ((size_t)&dummy_cpu))
64     #define ofs_tabl0 (((size_t)&dummy_cpu.cd.mips.vaddr_to_hostaddr_table0) - ((size_t)&dummy_cpu))
65     #define ofs_chunks ((size_t)&dummy_vth32_table.bintrans_chunks[0] - (size_t)&dummy_vth32_table)
66     #define ofs_chunkbase ((size_t)&dummy_cpu.cd.mips.chunk_base_address - (size_t)&dummy_cpu)
67 dpavlin 12 #define ofs_h_l (((size_t)&dummy_cpu.cd.mips.host_load) - ((size_t)&dummy_cpu))
68     #define ofs_h_s (((size_t)&dummy_cpu.cd.mips.host_store) - ((size_t)&dummy_cpu))
69 dpavlin 2
70    
71     static void (*bintrans_runchunk)(struct cpu *, unsigned char *);
72     static void (*bintrans_jump_to_32bit_pc)(struct cpu *);
73 dpavlin 10 static void (*bintrans_load_32bit)(struct cpu *);
74     static void (*bintrans_store_32bit)(struct cpu *);
75 dpavlin 2
76    
77     /*
78     * bintrans_write_quickjump():
79     */
80     static void bintrans_write_quickjump(struct memory *mem,
81     unsigned char *quickjump_code, uint32_t chunkoffset)
82     {
83     uint32_t i386_addr;
84     unsigned char *a = quickjump_code;
85    
86     i386_addr = chunkoffset + (size_t)mem->translation_code_chunk_space;
87     i386_addr = i386_addr - ((size_t)a + 5);
88    
89     /* printf("chunkoffset=%i, %08x %08x %i\n",
90     chunkoffset, i386_addr, a, ofs); */
91    
92     *a++ = 0xe9;
93     *a++ = i386_addr;
94     *a++ = i386_addr >> 8;
95     *a++ = i386_addr >> 16;
96     *a++ = i386_addr >> 24;
97     }
98    
99    
100     /*
101     * bintrans_write_chunkreturn():
102     */
103     static void bintrans_write_chunkreturn(unsigned char **addrp)
104     {
105     unsigned char *a = *addrp;
106     *a++ = 0xc3; /* ret */
107     *addrp = a;
108     }
109    
110    
111     /*
112     * bintrans_write_chunkreturn_fail():
113     */
114     static void bintrans_write_chunkreturn_fail(unsigned char **addrp)
115     {
116     unsigned char *a = *addrp;
117    
118     /* 81 cd 00 00 00 01 orl $0x1000000,%ebp */
119     *a++ = 0x81; *a++ = 0xcd;
120     *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x01; /* TODO: not hardcoded */
121    
122     *a++ = 0xc3; /* ret */
123     *addrp = a;
124     }
125    
126    
127     /*
128     * bintrans_write_pc_inc():
129     */
130     static void bintrans_write_pc_inc(unsigned char **addrp)
131     {
132     unsigned char *a = *addrp;
133    
134     /* 83 c7 04 add $0x4,%edi */
135     *a++ = 0x83; *a++ = 0xc7; *a++ = 4;
136    
137     #if 0
138     if (!bintrans_32bit_only) {
139     int ofs;
140     /* 83 96 zz zz zz zz 00 adcl $0x0,zz(%esi) */
141     ofs = ((size_t)&dummy_cpu.pc) - (size_t)&dummy_cpu;
142     ofs += 4;
143     *a++ = 0x83; *a++ = 0x96;
144     *a++ = ofs & 255;
145     *a++ = (ofs >> 8) & 255;
146     *a++ = (ofs >> 16) & 255;
147     *a++ = (ofs >> 24) & 255;
148     *a++ = 0;
149     }
150     #endif
151    
152     /* 45 inc %ebp */
153     *a++ = 0x45;
154    
155     *addrp = a;
156     }
157    
158    
159     /*
160     * load_pc_into_eax_edx():
161     */
162     static void load_pc_into_eax_edx(unsigned char **addrp)
163     {
164     unsigned char *a;
165     a = *addrp;
166    
167     /* 89 f8 mov %edi,%eax */
168     *a++ = 0x89; *a++ = 0xf8;
169    
170     #if 0
171     if (bintrans_32bit_only) {
172     /* 99 cltd */
173     *a++ = 0x99;
174     } else
175     #endif
176     {
177     int ofs = ((size_t)&dummy_cpu.pc) - (size_t)&dummy_cpu;
178     /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
179     ofs += 4;
180     *a++ = 0x8b; *a++ = 0x96;
181     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
182     }
183    
184     *addrp = a;
185     }
186    
187    
188     /*
189     * store_eax_edx_into_pc():
190     */
191     static void store_eax_edx_into_pc(unsigned char **addrp)
192     {
193     unsigned char *a;
194     int ofs = ((size_t)&dummy_cpu.pc) - (size_t)&dummy_cpu;
195     a = *addrp;
196    
197     /* 89 c7 mov %eax,%edi */
198     *a++ = 0x89; *a++ = 0xc7;
199    
200     /* 89 96 3c 30 00 00 mov %edx,0x303c(%esi) */
201     ofs += 4;
202     *a++ = 0x89; *a++ = 0x96;
203     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
204    
205     *addrp = a;
206     }
207    
208    
209     /*
210     * load_into_eax_edx():
211     *
212     * Usage: load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]); etc.
213     */
214     static void load_into_eax_edx(unsigned char **addrp, void *p)
215     {
216     unsigned char *a;
217     int ofs = (size_t)p - (size_t)&dummy_cpu;
218     a = *addrp;
219    
220     /* 8b 86 38 30 00 00 mov 0x3038(%esi),%eax */
221     *a++ = 0x8b; *a++ = 0x86;
222     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
223    
224     #if 0
225     if (bintrans_32bit_only) {
226     /* 99 cltd */
227     *a++ = 0x99;
228     } else
229     #endif
230     {
231     /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
232     ofs += 4;
233     *a++ = 0x8b; *a++ = 0x96;
234     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
235     }
236    
237     *addrp = a;
238     }
239    
240    
241     /*
242     * load_into_eax_and_sign_extend_into_edx():
243     *
244     * Usage: load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]); etc.
245     */
246     static void load_into_eax_and_sign_extend_into_edx(unsigned char **addrp, void *p)
247     {
248     unsigned char *a;
249     int ofs = (size_t)p - (size_t)&dummy_cpu;
250     a = *addrp;
251    
252     /* 8b 86 38 30 00 00 mov 0x3038(%esi),%eax */
253     *a++ = 0x8b; *a++ = 0x86;
254     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
255    
256     /* 99 cltd */
257     *a++ = 0x99;
258    
259     *addrp = a;
260     }
261    
262    
263     /*
264     * load_into_eax_dont_care_about_edx():
265     *
266     * Usage: load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]); etc.
267     */
268     static void load_into_eax_dont_care_about_edx(unsigned char **addrp, void *p)
269     {
270     unsigned char *a;
271     int ofs = (size_t)p - (size_t)&dummy_cpu;
272     a = *addrp;
273    
274     /* 8b 86 38 30 00 00 mov 0x3038(%esi),%eax */
275     *a++ = 0x8b; *a++ = 0x86;
276     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
277    
278     *addrp = a;
279     }
280    
281    
282     /*
283     * store_eax_edx():
284     *
285     * Usage: store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]); etc.
286     */
287     static void store_eax_edx(unsigned char **addrp, void *p)
288     {
289     unsigned char *a;
290     int ofs = (size_t)p - (size_t)&dummy_cpu;
291     a = *addrp;
292    
293     /* 89 86 38 30 00 00 mov %eax,0x3038(%esi) */
294     *a++ = 0x89; *a++ = 0x86;
295     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
296    
297     /* 89 96 3c 30 00 00 mov %edx,0x303c(%esi) */
298     ofs += 4;
299     *a++ = 0x89; *a++ = 0x96;
300     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
301    
302     *addrp = a;
303     }
304    
305    
306     /*
307     * bintrans_write_instruction__lui():
308     */
309     static int bintrans_write_instruction__lui(unsigned char **addrp, int rt, int imm)
310     {
311     unsigned char *a;
312    
313     a = *addrp;
314     if (rt == 0)
315     goto rt0;
316    
317     /* b8 00 00 dc fe mov $0xfedc0000,%eax */
318     *a++ = 0xb8; *a++ = 0; *a++ = 0;
319     *a++ = imm & 255; *a++ = imm >> 8;
320    
321     /* 99 cltd */
322     *a++ = 0x99;
323    
324     store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
325     *addrp = a;
326    
327     rt0:
328     bintrans_write_pc_inc(addrp);
329     return 1;
330     }
331    
332    
333     /*
334     * bintrans_write_instruction__jr():
335     */
336     static int bintrans_write_instruction__jr(unsigned char **addrp,
337     int rs, int rd, int special)
338     {
339     unsigned char *a;
340     int ofs;
341    
342     a = *addrp;
343    
344     /*
345     * Perform the jump by setting cpu->delay_slot = TO_BE_DELAYED
346     * and cpu->delay_jmpaddr = gpr[rs].
347     */
348    
349     /* c7 86 38 30 00 00 01 00 00 00 movl $0x1,0x3038(%esi) */
350     ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
351     *a++ = 0xc7; *a++ = 0x86;
352     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
353     *a++ = TO_BE_DELAYED; *a++ = 0; *a++ = 0; *a++ = 0;
354    
355     #if 0
356     if (bintrans_32bit_only)
357     load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
358     else
359     #endif
360     load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
361    
362     store_eax_edx(&a, &dummy_cpu.cd.mips.delay_jmpaddr);
363    
364     if (special == SPECIAL_JALR && rd != 0) {
365     /* gpr[rd] = retaddr (pc + 8) */
366    
367     #if 0
368     if (bintrans_32bit_only) {
369     load_pc_into_eax_edx(&a);
370     /* 83 c0 08 add $0x8,%eax */
371     *a++ = 0x83; *a++ = 0xc0; *a++ = 0x08;
372     } else
373     #endif
374     {
375     load_pc_into_eax_edx(&a);
376     /* 83 c0 08 add $0x8,%eax */
377     /* 83 d2 00 adc $0x0,%edx */
378     *a++ = 0x83; *a++ = 0xc0; *a++ = 0x08;
379     *a++ = 0x83; *a++ = 0xd2; *a++ = 0x00;
380     }
381    
382     store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
383     }
384    
385     *addrp = a;
386     bintrans_write_pc_inc(addrp);
387     return 1;
388     }
389    
390    
391     /*
392     * bintrans_write_instruction__mfmthilo():
393     */
394     static int bintrans_write_instruction__mfmthilo(unsigned char **addrp,
395     int rd, int from_flag, int hi_flag)
396     {
397     unsigned char *a;
398    
399     a = *addrp;
400    
401     if (from_flag) {
402     if (rd != 0) {
403     /* mfhi or mflo */
404     if (hi_flag)
405     load_into_eax_edx(&a, &dummy_cpu.cd.mips.hi);
406     else
407     load_into_eax_edx(&a, &dummy_cpu.cd.mips.lo);
408     store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
409     }
410     } else {
411     /* mthi or mtlo */
412     load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
413     if (hi_flag)
414     store_eax_edx(&a, &dummy_cpu.cd.mips.hi);
415     else
416     store_eax_edx(&a, &dummy_cpu.cd.mips.lo);
417     }
418    
419     *addrp = a;
420     bintrans_write_pc_inc(addrp);
421     return 1;
422     }
423    
424    
425     /*
426     * bintrans_write_instruction__addiu_etc():
427     */
428 dpavlin 12 static int bintrans_write_instruction__addiu_etc(
429     struct memory *mem, unsigned char **addrp,
430 dpavlin 2 int rt, int rs, int imm, int instruction_type)
431     {
432     unsigned char *a;
433     unsigned int uimm;
434    
435     /* TODO: overflow detection for ADDI and DADDI */
436     switch (instruction_type) {
437     case HI6_ADDI:
438     case HI6_DADDI:
439     return 0;
440     }
441    
442     a = *addrp;
443    
444     if (rt == 0)
445     goto rt0;
446    
447     uimm = imm & 0xffff;
448    
449     if (uimm == 0 && (instruction_type == HI6_ADDIU ||
450     instruction_type == HI6_ADDI)) {
451     load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
452     store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
453     goto rt0;
454     }
455    
456     if (uimm == 0 && (instruction_type == HI6_DADDIU ||
457     instruction_type == HI6_DADDI || instruction_type == HI6_ORI)) {
458     load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
459     store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
460     goto rt0;
461     }
462    
463 dpavlin 12 if (mem->bintrans_32bit_only)
464 dpavlin 2 load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
465     else
466     load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
467    
468     switch (instruction_type) {
469     case HI6_ADDIU:
470     case HI6_DADDIU:
471     case HI6_ADDI:
472     case HI6_DADDI:
473     if (imm & 0x8000) {
474     /* 05 39 fd ff ff add $0xfffffd39,%eax */
475     /* 83 d2 ff adc $0xffffffff,%edx */
476     *a++ = 0x05; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0xff; *a++ = 0xff;
477     if (instruction_type == HI6_DADDIU) {
478     *a++ = 0x83; *a++ = 0xd2; *a++ = 0xff;
479     }
480     } else {
481     /* 05 c7 02 00 00 add $0x2c7,%eax */
482     /* 83 d2 00 adc $0x0,%edx */
483     *a++ = 0x05; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0; *a++ = 0;
484     if (instruction_type == HI6_DADDIU) {
485     *a++ = 0x83; *a++ = 0xd2; *a++ = 0;
486     }
487     }
488     if (instruction_type == HI6_ADDIU) {
489     /* 99 cltd */
490     *a++ = 0x99;
491     }
492     break;
493     case HI6_ANDI:
494     /* 25 34 12 00 00 and $0x1234,%eax */
495     /* 31 d2 xor %edx,%edx */
496     *a++ = 0x25; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0; *a++ = 0;
497     *a++ = 0x31; *a++ = 0xd2;
498     break;
499     case HI6_ORI:
500     /* 0d 34 12 00 00 or $0x1234,%eax */
501     *a++ = 0xd; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0; *a++ = 0;
502     break;
503     case HI6_XORI:
504     /* 35 34 12 00 00 xor $0x1234,%eax */
505     *a++ = 0x35; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0; *a++ = 0;
506     break;
507     case HI6_SLTIU:
508     /* set if less than, unsigned. (compare edx:eax to ecx:ebx) */
509     /* ecx:ebx = the immediate value */
510     /* bb dc fe ff ff mov $0xfffffedc,%ebx */
511     /* b9 ff ff ff ff mov $0xffffffff,%ecx */
512     /* or */
513     /* 29 c9 sub %ecx,%ecx */
514     #if 0
515     if (bintrans_32bit_only) {
516     /* 99 cltd */
517     *a++ = 0x99;
518     }
519     #endif
520     *a++ = 0xbb; *a++ = uimm; *a++ = uimm >> 8;
521     if (uimm & 0x8000) {
522     *a++ = 0xff; *a++ = 0xff;
523     *a++ = 0xb9; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff;
524     } else {
525     *a++ = 0; *a++ = 0;
526     *a++ = 0x29; *a++ = 0xc9;
527     }
528    
529     /* if edx <= ecx and eax < ebx then 1, else 0. */
530     /* 39 ca cmp %ecx,%edx */
531     /* 77 0b ja <ret0> */
532     /* 39 d8 cmp %ebx,%eax */
533     /* 73 07 jae 58 <ret0> */
534     *a++ = 0x39; *a++ = 0xca;
535     *a++ = 0x77; *a++ = 0x0b;
536     *a++ = 0x39; *a++ = 0xd8;
537     *a++ = 0x73; *a++ = 0x07;
538    
539     /* b8 01 00 00 00 mov $0x1,%eax */
540     /* eb 02 jmp <common> */
541     *a++ = 0xb8; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
542     *a++ = 0xeb; *a++ = 0x02;
543    
544     /* ret0: */
545     /* 29 c0 sub %eax,%eax */
546     *a++ = 0x29; *a++ = 0xc0;
547    
548     /* common: */
549     /* 99 cltd */
550     *a++ = 0x99;
551     break;
552     case HI6_SLTI:
553     /* set if less than, signed. (compare edx:eax to ecx:ebx) */
554     /* ecx:ebx = the immediate value */
555     /* bb dc fe ff ff mov $0xfffffedc,%ebx */
556     /* b9 ff ff ff ff mov $0xffffffff,%ecx */
557     /* or */
558     /* 29 c9 sub %ecx,%ecx */
559     #if 0
560     if (bintrans_32bit_only) {
561     /* 99 cltd */
562     *a++ = 0x99;
563     }
564     #endif
565     *a++ = 0xbb; *a++ = uimm; *a++ = uimm >> 8;
566     if (uimm & 0x8000) {
567     *a++ = 0xff; *a++ = 0xff;
568     *a++ = 0xb9; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff;
569     } else {
570     *a++ = 0; *a++ = 0;
571     *a++ = 0x29; *a++ = 0xc9;
572     }
573    
574     /* if edx > ecx then 0. */
575     /* if edx < ecx then 1. */
576     /* if eax < ebx then 1, else 0. */
577     /* 39 ca cmp %ecx,%edx */
578     /* 7c 0a jl <ret1> */
579     /* 7f 04 jg <ret0> */
580     /* 39 d8 cmp %ebx,%eax */
581     /* 7c 04 jl <ret1> */
582     *a++ = 0x39; *a++ = 0xca;
583     *a++ = 0x7c; *a++ = 0x0a;
584     *a++ = 0x7f; *a++ = 0x04;
585     *a++ = 0x39; *a++ = 0xd8;
586     *a++ = 0x7c; *a++ = 0x04;
587    
588     /* ret0: */
589     /* 29 c0 sub %eax,%eax */
590     /* eb 05 jmp <common> */
591     *a++ = 0x29; *a++ = 0xc0;
592     *a++ = 0xeb; *a++ = 0x05;
593    
594     /* ret1: */
595     /* b8 01 00 00 00 mov $0x1,%eax */
596     *a++ = 0xb8; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
597    
598     /* common: */
599     /* 99 cltd */
600     *a++ = 0x99;
601     break;
602     }
603    
604     store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
605    
606     rt0:
607     *addrp = a;
608     bintrans_write_pc_inc(addrp);
609     return 1;
610     }
611    
612    
613     /*
614     * bintrans_write_instruction__jal():
615     */
616     static int bintrans_write_instruction__jal(unsigned char **addrp,
617     int imm, int link)
618     {
619     unsigned char *a;
620     uint32_t subimm;
621     int ofs;
622    
623     a = *addrp;
624    
625     load_pc_into_eax_edx(&a);
626    
627     if (link) {
628     /* gpr[31] = pc + 8 */
629     #if 0
630     if (bintrans_32bit_only) {
631     /* 50 push %eax */
632     /* 83 c0 08 add $0x8,%eax */
633     *a++ = 0x50;
634     *a++ = 0x83; *a++ = 0xc0; *a++ = 0x08;
635     } else
636     #endif
637     {
638     /* 50 push %eax */
639     /* 52 push %edx */
640     /* 83 c0 08 add $0x8,%eax */
641     /* 83 d2 00 adc $0x0,%edx */
642     *a++ = 0x50;
643     *a++ = 0x52;
644     *a++ = 0x83; *a++ = 0xc0; *a++ = 0x08;
645     *a++ = 0x83; *a++ = 0xd2; *a++ = 0x00;
646     }
647     store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[31]);
648     #if 0
649     if (bintrans_32bit_only) {
650     /* 58 pop %eax */
651     *a++ = 0x58;
652     } else
653     #endif
654     {
655     /* 5a pop %edx */
656     /* 58 pop %eax */
657     *a++ = 0x5a;
658     *a++ = 0x58;
659     }
660     }
661    
662     /* delay_jmpaddr = top 36 bits of pc together with lowest 28 bits of imm*4: */
663     imm *= 4;
664    
665     /* Add 4, because the jump is from the delay slot: */
666     /* 83 c0 04 add $0x4,%eax */
667     /* 83 d2 00 adc $0x0,%edx */
668     *a++ = 0x83; *a++ = 0xc0; *a++ = 0x04;
669     *a++ = 0x83; *a++ = 0xd2; *a++ = 0x00;
670    
671     /* c1 e8 1c shr $0x1c,%eax */
672     /* c1 e0 1c shl $0x1c,%eax */
673     *a++ = 0xc1; *a++ = 0xe8; *a++ = 0x1c;
674     *a++ = 0xc1; *a++ = 0xe0; *a++ = 0x1c;
675    
676     subimm = imm;
677     subimm &= 0x0fffffff;
678    
679     /* 0d 78 56 34 12 or $0x12345678,%eax */
680     *a++ = 0x0d; *a++ = subimm; *a++ = subimm >> 8;
681     *a++ = subimm >> 16; *a++ = subimm >> 24;
682    
683     store_eax_edx(&a, &dummy_cpu.cd.mips.delay_jmpaddr);
684    
685     /* c7 86 38 30 00 00 01 00 00 00 movl $0x1,0x3038(%esi) */
686     ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
687     *a++ = 0xc7; *a++ = 0x86;
688     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
689     *a++ = TO_BE_DELAYED; *a++ = 0; *a++ = 0; *a++ = 0;
690    
691     *addrp = a;
692     bintrans_write_pc_inc(addrp);
693     return 1;
694     }
695    
696    
697     /*
698     * bintrans_write_instruction__addu_etc():
699     */
700 dpavlin 12 static int bintrans_write_instruction__addu_etc(
701     struct memory *mem, unsigned char **addrp,
702 dpavlin 2 int rd, int rs, int rt, int sa, int instruction_type)
703     {
704     unsigned char *a;
705     int load64 = 0, do_store = 1;
706    
707     /* TODO: Not yet */
708     switch (instruction_type) {
709     case SPECIAL_MULT:
710     case SPECIAL_MULTU:
711     case SPECIAL_DIV:
712     case SPECIAL_DIVU:
713     if (rd != 0)
714     return 0;
715     break;
716     case SPECIAL_DSLL:
717     case SPECIAL_DSLL32:
718     case SPECIAL_DSRA:
719     case SPECIAL_DSRA32:
720     case SPECIAL_DSRL:
721     case SPECIAL_DSRL32:
722     case SPECIAL_MOVZ:
723     case SPECIAL_MOVN:
724     bintrans_write_chunkreturn_fail(addrp);
725     return 0;
726 dpavlin 12 case SPECIAL_SLT:
727     case SPECIAL_SLTU:
728     if (!mem->bintrans_32bit_only) {
729     bintrans_write_chunkreturn_fail(addrp);
730     return 0;
731     }
732     break;
733 dpavlin 2 }
734    
735     switch (instruction_type) {
736     case SPECIAL_DADDU:
737     case SPECIAL_DSUBU:
738     case SPECIAL_OR:
739     case SPECIAL_AND:
740     case SPECIAL_NOR:
741     case SPECIAL_XOR:
742     case SPECIAL_DSLL:
743     case SPECIAL_DSRL:
744     case SPECIAL_DSRA:
745     case SPECIAL_DSLL32:
746     case SPECIAL_DSRL32:
747     case SPECIAL_DSRA32:
748     load64 = 1;
749     }
750    
751     switch (instruction_type) {
752     case SPECIAL_MULT:
753     case SPECIAL_MULTU:
754     case SPECIAL_DIV:
755     case SPECIAL_DIVU:
756     break;
757     default:
758     if (rd == 0)
759     goto rd0;
760     }
761    
762     a = *addrp;
763    
764     if ((instruction_type == SPECIAL_ADDU || instruction_type == SPECIAL_DADDU
765     || instruction_type == SPECIAL_OR) && rt == 0) {
766     if (load64)
767     load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
768     else
769     load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
770     store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
771     *addrp = a;
772     goto rd0;
773     }
774    
775     /* edx:eax = rs, ecx:ebx = rt */
776 dpavlin 12 if (load64 && !mem->bintrans_32bit_only) {
777 dpavlin 2 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
778     /* 89 c3 mov %eax,%ebx */
779     /* 89 d1 mov %edx,%ecx */
780     *a++ = 0x89; *a++ = 0xc3; *a++ = 0x89; *a++ = 0xd1;
781     load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
782     } else {
783     load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
784     /* 89 c3 mov %eax,%ebx */
785     /* 89 d1 mov %edx,%ecx */
786     *a++ = 0x89; *a++ = 0xc3; *a++ = 0x89; *a++ = 0xd1;
787     load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
788     }
789    
790     switch (instruction_type) {
791     case SPECIAL_ADDU:
792     /* 01 d8 add %ebx,%eax */
793     /* 99 cltd */
794     *a++ = 0x01; *a++ = 0xd8;
795     *a++ = 0x99;
796     break;
797     case SPECIAL_DADDU:
798     /* 01 d8 add %ebx,%eax */
799     /* 11 ca adc %ecx,%edx */
800     *a++ = 0x01; *a++ = 0xd8;
801     *a++ = 0x11; *a++ = 0xca;
802     break;
803     case SPECIAL_SUBU:
804     /* 29 d8 sub %ebx,%eax */
805     /* 99 cltd */
806     *a++ = 0x29; *a++ = 0xd8;
807     *a++ = 0x99;
808     break;
809     case SPECIAL_DSUBU:
810     /* 29 d8 sub %ebx,%eax */
811     /* 19 ca sbb %ecx,%edx */
812     *a++ = 0x29; *a++ = 0xd8;
813     *a++ = 0x19; *a++ = 0xca;
814     break;
815     case SPECIAL_AND:
816     /* 21 d8 and %ebx,%eax */
817     /* 21 ca and %ecx,%edx */
818     *a++ = 0x21; *a++ = 0xd8;
819     *a++ = 0x21; *a++ = 0xca;
820     break;
821     case SPECIAL_OR:
822     /* 09 d8 or %ebx,%eax */
823     /* 09 ca or %ecx,%edx */
824     *a++ = 0x09; *a++ = 0xd8;
825     *a++ = 0x09; *a++ = 0xca;
826     break;
827     case SPECIAL_NOR:
828     /* 09 d8 or %ebx,%eax */
829     /* 09 ca or %ecx,%edx */
830     /* f7 d0 not %eax */
831     /* f7 d2 not %edx */
832     *a++ = 0x09; *a++ = 0xd8;
833     *a++ = 0x09; *a++ = 0xca;
834     *a++ = 0xf7; *a++ = 0xd0;
835     *a++ = 0xf7; *a++ = 0xd2;
836     break;
837     case SPECIAL_XOR:
838     /* 31 d8 xor %ebx,%eax */
839     /* 31 ca xor %ecx,%edx */
840     *a++ = 0x31; *a++ = 0xd8;
841     *a++ = 0x31; *a++ = 0xca;
842     break;
843     case SPECIAL_SLL:
844     /* 89 d8 mov %ebx,%eax */
845     /* c1 e0 1f shl $0x1f,%eax */
846     /* 99 cltd */
847     *a++ = 0x89; *a++ = 0xd8;
848     if (sa == 1) {
849     *a++ = 0xd1; *a++ = 0xe0;
850     } else {
851     *a++ = 0xc1; *a++ = 0xe0; *a++ = sa;
852     }
853     *a++ = 0x99;
854     break;
855     case SPECIAL_SRA:
856     /* 89 d8 mov %ebx,%eax */
857     /* c1 f8 1f sar $0x1f,%eax */
858     /* 99 cltd */
859     *a++ = 0x89; *a++ = 0xd8;
860     if (sa == 1) {
861     *a++ = 0xd1; *a++ = 0xf8;
862     } else {
863     *a++ = 0xc1; *a++ = 0xf8; *a++ = sa;
864     }
865     *a++ = 0x99;
866     break;
867     case SPECIAL_SRL:
868     /* 89 d8 mov %ebx,%eax */
869     /* c1 e8 1f shr $0x1f,%eax */
870     /* 99 cltd */
871     *a++ = 0x89; *a++ = 0xd8;
872     if (sa == 1) {
873     *a++ = 0xd1; *a++ = 0xe8;
874     } else {
875     *a++ = 0xc1; *a++ = 0xe8; *a++ = sa;
876     }
877     *a++ = 0x99;
878     break;
879     case SPECIAL_SLTU:
880 dpavlin 12 /* NOTE: 32-bit ONLY! */
881     /* set if less than, unsigned. (compare eax to ebx) */
882 dpavlin 2 /* 39 d8 cmp %ebx,%eax */
883     /* 73 07 jae 58 <ret0> */
884     *a++ = 0x39; *a++ = 0xd8;
885     *a++ = 0x73; *a++ = 0x07;
886    
887     /* b8 01 00 00 00 mov $0x1,%eax */
888     /* eb 02 jmp <common> */
889     *a++ = 0xb8; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
890     *a++ = 0xeb; *a++ = 0x02;
891    
892     /* ret0: */
893     /* 29 c0 sub %eax,%eax */
894     *a++ = 0x29; *a++ = 0xc0;
895    
896     /* common: */
897     /* 99 cltd */
898     *a++ = 0x99;
899     break;
900     case SPECIAL_SLT:
901 dpavlin 12 /* NOTE: 32-bit ONLY! */
902     /* set if less than, signed. (compare eax to ebx) */
903 dpavlin 2 /* 39 d8 cmp %ebx,%eax */
904     /* 7c 04 jl <ret1> */
905     *a++ = 0x39; *a++ = 0xd8;
906     *a++ = 0x7c; *a++ = 0x04;
907    
908     /* ret0: */
909     /* 29 c0 sub %eax,%eax */
910     /* eb 05 jmp <common> */
911     *a++ = 0x29; *a++ = 0xc0;
912     *a++ = 0xeb; *a++ = 0x05;
913    
914     /* ret1: */
915     /* b8 01 00 00 00 mov $0x1,%eax */
916     *a++ = 0xb8; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
917    
918     /* common: */
919     /* 99 cltd */
920     *a++ = 0x99;
921     break;
922     case SPECIAL_SLLV:
923     /* rd = rt << (rs&31) (logical) eax = ebx << (eax&31) */
924     /* xchg ebx,eax, then we can do eax = eax << (ebx&31) */
925     /* 93 xchg %eax,%ebx */
926     /* 89 d9 mov %ebx,%ecx */
927     /* 83 e1 1f and $0x1f,%ecx */
928     /* d3 e0 shl %cl,%eax */
929     *a++ = 0x93;
930     *a++ = 0x89; *a++ = 0xd9;
931     *a++ = 0x83; *a++ = 0xe1; *a++ = 0x1f;
932     *a++ = 0xd3; *a++ = 0xe0;
933     /* 99 cltd */
934     *a++ = 0x99;
935     break;
936     case SPECIAL_SRLV:
937     /* rd = rt >> (rs&31) (logical) eax = ebx >> (eax&31) */
938     /* xchg ebx,eax, then we can do eax = eax >> (ebx&31) */
939     /* 93 xchg %eax,%ebx */
940     /* 89 d9 mov %ebx,%ecx */
941     /* 83 e1 1f and $0x1f,%ecx */
942     /* d3 e8 shr %cl,%eax */
943     *a++ = 0x93;
944     *a++ = 0x89; *a++ = 0xd9;
945     *a++ = 0x83; *a++ = 0xe1; *a++ = 0x1f;
946     *a++ = 0xd3; *a++ = 0xe8;
947     /* 99 cltd */
948     *a++ = 0x99;
949     break;
950     case SPECIAL_SRAV:
951     /* rd = rt >> (rs&31) (arithmetic) eax = ebx >> (eax&31) */
952     /* xchg ebx,eax, then we can do eax = eax >> (ebx&31) */
953     /* 93 xchg %eax,%ebx */
954     /* 89 d9 mov %ebx,%ecx */
955     /* 83 e1 1f and $0x1f,%ecx */
956     /* d3 f8 sar %cl,%eax */
957     *a++ = 0x93;
958     *a++ = 0x89; *a++ = 0xd9;
959     *a++ = 0x83; *a++ = 0xe1; *a++ = 0x1f;
960     *a++ = 0xd3; *a++ = 0xf8;
961     /* 99 cltd */
962     *a++ = 0x99;
963     break;
964     case SPECIAL_MULT:
965     case SPECIAL_MULTU:
966     /* 57 push %edi */
967     *a++ = 0x57;
968     if (instruction_type == SPECIAL_MULT) {
969     /* f7 eb imul %ebx */
970     *a++ = 0xf7; *a++ = 0xeb;
971     } else {
972     /* f7 e3 mul %ebx */
973     *a++ = 0xf7; *a++ = 0xe3;
974     }
975     /* here: edx:eax = hi:lo */
976     /* 89 d7 mov %edx,%edi */
977     /* 99 cltd */
978     *a++ = 0x89; *a++ = 0xd7;
979     *a++ = 0x99;
980     /* here: edi=hi, edx:eax = sign-extended lo */
981     store_eax_edx(&a, &dummy_cpu.cd.mips.lo);
982     /* 89 f8 mov %edi,%eax */
983     /* 99 cltd */
984     *a++ = 0x89; *a++ = 0xf8;
985     *a++ = 0x99;
986     /* here: edx:eax = sign-extended hi */
987     store_eax_edx(&a, &dummy_cpu.cd.mips.hi);
988     /* 5f pop %edi */
989     *a++ = 0x5f;
990     do_store = 0;
991     break;
992     case SPECIAL_DIV:
993     case SPECIAL_DIVU:
994     /*
995     * In: edx:eax = rs, ecx:ebx = rt
996     * Out: LO = rs / rt, HI = rs % rt
997     */
998     /* Division by zero on MIPS is undefined, but on
999     i386 it causes an exception, so we'll try to
1000     avoid that. */
1001     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00; /* cmp $0x0,%ebx */
1002     *a++ = 0x75; *a++ = 0x01; /* jne skip_inc */
1003     *a++ = 0x43; /* inc %ebx */
1004    
1005     /* 57 push %edi */
1006     *a++ = 0x57;
1007     if (instruction_type == SPECIAL_DIV) {
1008     *a++ = 0x99; /* cltd */
1009     *a++ = 0xf7; *a++ = 0xfb; /* idiv %ebx */
1010     } else {
1011     *a++ = 0x29; *a++ = 0xd2; /* sub %edx,%edx */
1012     *a++ = 0xf7; *a++ = 0xf3; /* div %ebx */
1013     }
1014     /* here: edx:eax = hi:lo */
1015     /* 89 d7 mov %edx,%edi */
1016     /* 99 cltd */
1017     *a++ = 0x89; *a++ = 0xd7;
1018     *a++ = 0x99;
1019     /* here: edi=hi, edx:eax = sign-extended lo */
1020     store_eax_edx(&a, &dummy_cpu.cd.mips.lo);
1021     /* 89 f8 mov %edi,%eax */
1022     /* 99 cltd */
1023     *a++ = 0x89; *a++ = 0xf8;
1024     *a++ = 0x99;
1025     /* here: edx:eax = sign-extended hi */
1026     store_eax_edx(&a, &dummy_cpu.cd.mips.hi);
1027     /* 5f pop %edi */
1028     *a++ = 0x5f;
1029     do_store = 0;
1030     break;
1031     #if 0
1032     /* TODO: These are from bintrans_alpha.c. Translate them to i386. */
1033    
1034     case SPECIAL_DSLL:
1035     *a++ = 0x21; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sll t1,sa,t0 */
1036     break;
1037     case SPECIAL_DSLL32:
1038     sa += 32;
1039     *a++ = 0x21; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sll t1,sa,t0 */
1040     break;
1041     case SPECIAL_DSRA:
1042     *a++ = 0x81; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sra t1,sa,t0 */
1043     break;
1044     case SPECIAL_DSRA32:
1045     sa += 32;
1046     *a++ = 0x81; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sra t1,sa,t0 */
1047     break;
1048     case SPECIAL_DSRL:
1049     /* Note: bits of sa are distributed among two different bytes. */
1050     *a++ = 0x81; *a++ = 0x16 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48;
1051     break;
1052     case SPECIAL_DSRL32:
1053     /* Note: bits of sa are distributed among two different bytes. */
1054     sa += 32;
1055     *a++ = 0x81; *a++ = 0x16 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48;
1056     break;
1057     #endif
1058     }
1059    
1060     if (do_store)
1061     store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
1062    
1063     *addrp = a;
1064     rd0:
1065     bintrans_write_pc_inc(addrp);
1066     return 1;
1067     }
1068    
1069    
1070     /*
1071     * bintrans_write_instruction__mfc_mtc():
1072     */
1073     static int bintrans_write_instruction__mfc_mtc(struct memory *mem,
1074     unsigned char **addrp, int coproc_nr, int flag64bit, int rt,
1075     int rd, int mtcflag)
1076     {
1077     unsigned char *a, *failskip;
1078     int ofs;
1079    
1080     if (mtcflag && flag64bit) {
1081     /* dmtc */
1082     return 0;
1083     }
1084    
1085     /*
1086     * NOTE: Only a few registers are readable without side effects.
1087     */
1088     if (rt == 0 && !mtcflag)
1089     return 0;
1090    
1091     if (coproc_nr >= 1)
1092     return 0;
1093    
1094     if (rd == COP0_RANDOM || rd == COP0_COUNT)
1095     return 0;
1096    
1097     a = *addrp;
1098    
1099     /*************************************************************
1100     *
1101     * TODO: Check for kernel mode, or Coproc X usability bit!
1102     *
1103     *************************************************************/
1104    
1105     /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
1106     ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
1107     *a++ = 0x8b; *a++ = 0x96;
1108     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1109    
1110     /* here, edx = cpu->coproc[0] */
1111    
1112     if (mtcflag) {
1113     /* mtc */
1114    
1115     /* TODO: This code only works for mtc0, not dmtc0 */
1116    
1117     /* 8b 9a 38 30 00 00 mov 0x3038(%edx),%ebx */
1118     ofs = ((size_t)&dummy_coproc.reg[rd]) - (size_t)&dummy_coproc;
1119     *a++ = 0x8b; *a++ = 0x9a;
1120     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1121    
1122     load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
1123    
1124     /*
1125     * Here: eax contains the value in register rt,
1126     * ebx contains the coproc register rd value.
1127     *
1128     * In the general case, only allow mtc if it does not
1129     * change the coprocessor register!
1130     */
1131    
1132     switch (rd) {
1133    
1134     case COP0_INDEX:
1135     break;
1136    
1137     case COP0_ENTRYLO0:
1138     case COP0_ENTRYLO1:
1139     /* TODO: Not all bits are writable! */
1140     break;
1141    
1142     case COP0_EPC:
1143     break;
1144    
1145     case COP0_STATUS:
1146     /* Only allow updates to the status register if
1147     the interrupt enable bits were changed, but no
1148     other bits! */
1149     /* 89 c1 mov %eax,%ecx */
1150     /* 89 da mov %ebx,%edx */
1151     /* 81 e1 00 00 e7 0f and $0x0fe70000,%ecx */
1152     /* 81 e2 00 00 e7 0f and $0x0fe70000,%edx */
1153     /* 39 ca cmp %ecx,%edx */
1154     /* 74 01 je <ok> */
1155     *a++ = 0x89; *a++ = 0xc1;
1156     *a++ = 0x89; *a++ = 0xda;
1157     *a++ = 0x81; *a++ = 0xe1; *a++ = 0x00; *a++ = 0x00;
1158     if (mem->bintrans_32bit_only) {
1159     *a++ = 0xe7; *a++ = 0x0f;
1160     } else {
1161     *a++ = 0xff; *a++ = 0xff;
1162     }
1163     *a++ = 0x81; *a++ = 0xe2; *a++ = 0x00; *a++ = 0x00;
1164     if (mem->bintrans_32bit_only) {
1165     *a++ = 0xe7; *a++ = 0x0f;
1166     } else {
1167     *a++ = 0xff; *a++ = 0xff;
1168     }
1169     *a++ = 0x39; *a++ = 0xca;
1170     *a++ = 0x74; failskip = a; *a++ = 0x00;
1171     bintrans_write_chunkreturn_fail(&a);
1172     *failskip = (size_t)a - (size_t)failskip - 1;
1173    
1174     /* Only allow the update if it would NOT cause
1175     an interrupt exception: */
1176    
1177     /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
1178     ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
1179     *a++ = 0x8b; *a++ = 0x96;
1180     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1181    
1182     /* 8b 9a 38 30 00 00 mov 0x3038(%edx),%ebx */
1183     ofs = ((size_t)&dummy_coproc.reg[COP0_CAUSE]) - (size_t)&dummy_coproc;
1184     *a++ = 0x8b; *a++ = 0x9a;
1185     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1186    
1187     /* 21 c3 and %eax,%ebx */
1188     /* 81 e3 00 ff 00 00 and $0xff00,%ebx */
1189     /* 83 fb 00 cmp $0x0,%ebx */
1190     /* 74 01 je <ok> */
1191     *a++ = 0x21; *a++ = 0xc3;
1192     *a++ = 0x81; *a++ = 0xe3; *a++ = 0x00;
1193     *a++ = 0xff; *a++ = 0x00; *a++ = 0x00;
1194     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
1195     *a++ = 0x74; failskip = a; *a++ = 0x00;
1196     bintrans_write_chunkreturn_fail(&a);
1197     *failskip = (size_t)a - (size_t)failskip - 1;
1198    
1199     break;
1200    
1201     default:
1202     /* 39 d8 cmp %ebx,%eax */
1203     /* 74 01 je <ok> */
1204     *a++ = 0x39; *a++ = 0xd8;
1205     *a++ = 0x74; failskip = a; *a++ = 0x00;
1206     bintrans_write_chunkreturn_fail(&a);
1207     *failskip = (size_t)a - (size_t)failskip - 1;
1208     }
1209    
1210     /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
1211     ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
1212     *a++ = 0x8b; *a++ = 0x96;
1213     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1214    
1215     /* 8d 9a 38 30 00 00 lea 0x3038(%edx),%ebx */
1216     ofs = ((size_t)&dummy_coproc.reg[rd]) - (size_t)&dummy_coproc;
1217     *a++ = 0x8d; *a++ = 0x9a;
1218     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1219    
1220     /* Sign-extend eax into edx:eax, and store it in
1221     coprocessor register rd: */
1222     /* 99 cltd */
1223     *a++ = 0x99;
1224    
1225     /* 89 03 mov %eax,(%ebx) */
1226     /* 89 53 04 mov %edx,0x4(%ebx) */
1227     *a++ = 0x89; *a++ = 0x03;
1228     *a++ = 0x89; *a++ = 0x53; *a++ = 0x04;
1229     } else {
1230     /* mfc */
1231    
1232     /* 8b 82 38 30 00 00 mov 0x3038(%edx),%eax */
1233     ofs = ((size_t)&dummy_coproc.reg[rd]) - (size_t)&dummy_coproc;
1234     *a++ = 0x8b; *a++ = 0x82;
1235     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1236    
1237     if (flag64bit) {
1238     /* Load high 32 bits: (note: edx gets overwritten) */
1239     /* 8b 92 3c 30 00 00 mov 0x303c(%edx),%edx */
1240     ofs += 4;
1241     *a++ = 0x8b; *a++ = 0x92;
1242     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1243     } else {
1244     /* 99 cltd */
1245     *a++ = 0x99;
1246     }
1247    
1248     store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
1249     }
1250    
1251     *addrp = a;
1252     bintrans_write_pc_inc(addrp);
1253     return 1;
1254     }
1255    
1256    
1257     /*
1258     * bintrans_write_instruction__branch():
1259     */
1260     static int bintrans_write_instruction__branch(unsigned char **addrp,
1261     int instruction_type, int regimm_type, int rt, int rs, int imm)
1262     {
1263     unsigned char *a;
1264     unsigned char *skip1 = NULL, *skip2 = NULL;
1265     int ofs, likely = 0;
1266    
1267     switch (instruction_type) {
1268     case HI6_BEQL:
1269     case HI6_BNEL:
1270     case HI6_BLEZL:
1271     case HI6_BGTZL:
1272     likely = 1;
1273     }
1274    
1275     /* TODO: See the Alpha backend on how these could be implemented: */
1276     if (likely)
1277     return 0;
1278    
1279     a = *addrp;
1280    
1281     /*
1282     * edx:eax = gpr[rs]; ecx:ebx = gpr[rt];
1283     *
1284     * Compare for equality (BEQ).
1285     * If the result was zero, then it means equality; perform the
1286     * delayed jump. Otherwise: skip.
1287     */
1288    
1289     switch (instruction_type) {
1290     case HI6_BEQ:
1291     case HI6_BNE:
1292     load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
1293     /* 89 c3 mov %eax,%ebx */
1294     /* 89 d1 mov %edx,%ecx */
1295     *a++ = 0x89; *a++ = 0xc3; *a++ = 0x89; *a++ = 0xd1;
1296     }
1297     load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
1298    
1299     if (instruction_type == HI6_BEQ && rt != rs) {
1300     /* If rt != rs, then skip. */
1301     /* 39 c3 cmp %eax,%ebx */
1302     /* 75 05 jne 155 <skip> */
1303     /* 39 d1 cmp %edx,%ecx */
1304     /* 75 01 jne 155 <skip> */
1305     *a++ = 0x39; *a++ = 0xc3;
1306     *a++ = 0x75; skip1 = a; *a++ = 0x00;
1307     #if 0
1308     if (!bintrans_32bit_only)
1309     #endif
1310     {
1311     *a++ = 0x39; *a++ = 0xd1;
1312     *a++ = 0x75; skip2 = a; *a++ = 0x00;
1313     }
1314     }
1315    
1316     if (instruction_type == HI6_BNE) {
1317     /* If rt != rs, then ok. Otherwise skip. */
1318     #if 0
1319     if (bintrans_32bit_only) {
1320     /* 39 c3 cmp %eax,%ebx */
1321     /* 74 xx je <skip> */
1322     *a++ = 0x39; *a++ = 0xc3;
1323     *a++ = 0x74; skip2 = a; *a++ = 0x00;
1324     } else
1325     #endif
1326     {
1327     /* 39 c3 cmp %eax,%ebx */
1328     /* 75 06 jne 156 <bra> */
1329     /* 39 d1 cmp %edx,%ecx */
1330     /* 75 02 jne 156 <bra> */
1331     /* eb 01 jmp 157 <skip> */
1332     *a++ = 0x39; *a++ = 0xc3;
1333     *a++ = 0x75; *a++ = 0x06;
1334     *a++ = 0x39; *a++ = 0xd1;
1335     *a++ = 0x75; *a++ = 0x02;
1336     *a++ = 0xeb; skip2 = a; *a++ = 0x00;
1337     }
1338     }
1339    
1340     if (instruction_type == HI6_BLEZ) {
1341     /* If both eax and edx are zero, then do the branch. */
1342     /* 83 f8 00 cmp $0x0,%eax */
1343     /* 75 07 jne <nott> */
1344     /* 83 fa 00 cmp $0x0,%edx */
1345     /* 75 02 jne 23d <nott> */
1346     /* eb 01 jmp <branch> */
1347     *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1348     *a++ = 0x75; *a++ = 0x07;
1349     *a++ = 0x83; *a++ = 0xfa; *a++ = 0x00;
1350     *a++ = 0x75; *a++ = 0x02;
1351     *a++ = 0xeb; skip1 = a; *a++ = 0x00;
1352    
1353     /* If high bit of edx is set, then rs < 0. */
1354     /* f7 c2 00 00 00 80 test $0x80000000,%edx */
1355     /* 74 00 jz skip */
1356     *a++ = 0xf7; *a++ = 0xc2; *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x80;
1357     *a++ = 0x74; skip2 = a; *a++ = 0x00;
1358    
1359     if (skip1 != NULL)
1360     *skip1 = (size_t)a - (size_t)skip1 - 1;
1361     skip1 = NULL;
1362     }
1363     if (instruction_type == HI6_BGTZ) {
1364     /* If both eax and edx are zero, then skip the branch. */
1365     /* 83 f8 00 cmp $0x0,%eax */
1366     /* 75 07 jne <nott> */
1367     /* 83 fa 00 cmp $0x0,%edx */
1368     /* 75 02 jne 23d <nott> */
1369     /* eb 01 jmp <skip> */
1370     *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1371     *a++ = 0x75; *a++ = 0x07;
1372     *a++ = 0x83; *a++ = 0xfa; *a++ = 0x00;
1373     *a++ = 0x75; *a++ = 0x02;
1374     *a++ = 0xeb; skip1 = a; *a++ = 0x00;
1375    
1376     /* If high bit of edx is set, then rs < 0. */
1377     /* f7 c2 00 00 00 80 test $0x80000000,%edx */
1378     /* 75 00 jnz skip */
1379     *a++ = 0xf7; *a++ = 0xc2; *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x80;
1380     *a++ = 0x75; skip2 = a; *a++ = 0x00;
1381     }
1382     if (instruction_type == HI6_REGIMM && regimm_type == REGIMM_BLTZ) {
1383     /* If high bit of edx is set, then rs < 0. */
1384     /* f7 c2 00 00 00 80 test $0x80000000,%edx */
1385     /* 74 00 jz skip */
1386     *a++ = 0xf7; *a++ = 0xc2; *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x80;
1387     *a++ = 0x74; skip2 = a; *a++ = 0x00;
1388     }
1389     if (instruction_type == HI6_REGIMM && regimm_type == REGIMM_BGEZ) {
1390     /* If high bit of edx is not set, then rs >= 0. */
1391     /* f7 c2 00 00 00 80 test $0x80000000,%edx */
1392     /* 75 00 jnz skip */
1393     *a++ = 0xf7; *a++ = 0xc2; *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x80;
1394     *a++ = 0x75; skip2 = a; *a++ = 0x00;
1395     }
1396    
1397     /*
1398     * Perform the jump by setting cpu->delay_slot = TO_BE_DELAYED
1399     * and cpu->delay_jmpaddr = pc + 4 + (imm << 2).
1400     */
1401    
1402     /* c7 86 38 30 00 00 01 00 00 00 movl $0x1,0x3038(%esi) */
1403     ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
1404     *a++ = 0xc7; *a++ = 0x86;
1405     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1406     *a++ = TO_BE_DELAYED; *a++ = 0; *a++ = 0; *a++ = 0;
1407    
1408     load_pc_into_eax_edx(&a);
1409    
1410     /* 05 78 56 34 12 add $0x12345678,%eax */
1411     /* 83 d2 00 adc $0x0,%edx */
1412     /* or */
1413     /* 83 d2 ff adc $0xffffffff,%edx */
1414     imm = (imm << 2) + 4;
1415     *a++ = 0x05; *a++ = imm; *a++ = imm >> 8; *a++ = imm >> 16; *a++ = imm >> 24;
1416     if (imm >= 0) {
1417     *a++ = 0x83; *a++ = 0xd2; *a++ = 0x00;
1418     } else {
1419     *a++ = 0x83; *a++ = 0xd2; *a++ = 0xff;
1420     }
1421     store_eax_edx(&a, &dummy_cpu.cd.mips.delay_jmpaddr);
1422    
1423     if (skip1 != NULL)
1424     *skip1 = (size_t)a - (size_t)skip1 - 1;
1425     if (skip2 != NULL)
1426     *skip2 = (size_t)a - (size_t)skip2 - 1;
1427    
1428     *addrp = a;
1429     bintrans_write_pc_inc(addrp);
1430     return 1;
1431     }
1432    
1433    
1434     /*
1435     * bintrans_write_instruction__delayedbranch():
1436     */
1437     static int bintrans_write_instruction__delayedbranch(struct memory *mem,
1438     unsigned char **addrp, uint32_t *potential_chunk_p, uint32_t *chunks,
1439     int only_care_about_chunk_p, int p, int forward)
1440     {
1441     unsigned char *a, *skip=NULL, *failskip;
1442     int ofs;
1443     uint32_t i386_addr;
1444    
1445     a = *addrp;
1446    
1447     if (only_care_about_chunk_p)
1448     goto try_chunk_p;
1449    
1450     /* Skip all of this if there is no branch: */
1451     ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
1452    
1453     /* 8b 86 38 30 00 00 mov 0x3038(%esi),%eax */
1454     *a++ = 0x8b; *a++ = 0x86;
1455     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1456    
1457     /* 83 f8 00 cmp $0x0,%eax */
1458     /* 74 01 je 16b <skippa> */
1459     *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1460     *a++ = 0x74; skip = a; *a++ = 0;
1461    
1462     /*
1463     * Perform the jump by setting cpu->delay_slot = 0
1464     * and pc = cpu->delay_jmpaddr.
1465     */
1466    
1467     /* c7 86 38 30 00 00 00 00 00 00 movl $0x0,0x3038(%esi) */
1468     ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
1469     *a++ = 0xc7; *a++ = 0x86;
1470     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1471     *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0;
1472    
1473     /* REMEMBER old pc: */
1474     load_pc_into_eax_edx(&a);
1475     /* 89 c3 mov %eax,%ebx */
1476     /* 89 d1 mov %edx,%ecx */
1477     *a++ = 0x89; *a++ = 0xc3;
1478     *a++ = 0x89; *a++ = 0xd1;
1479     load_into_eax_edx(&a, &dummy_cpu.cd.mips.delay_jmpaddr);
1480     store_eax_edx_into_pc(&a);
1481    
1482     try_chunk_p:
1483    
1484     if (potential_chunk_p == NULL) {
1485     if (mem->bintrans_32bit_only) {
1486     #if 1
1487     /* 8b 86 78 56 34 12 mov 0x12345678(%esi),%eax */
1488     /* ff e0 jmp *%eax */
1489     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_jump_to_32bit_pc) - (size_t)&dummy_cpu;
1490     *a++ = 0x8b; *a++ = 0x86;
1491     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1492     *a++ = 0xff; *a++ = 0xe0;
1493    
1494     #else
1495     /* Don't execute too many instructions. */
1496     /* 81 fd f0 1f 00 00 cmpl $0x1ff0,%ebp */
1497     /* 7c 01 jl <okk> */
1498     /* c3 ret */
1499     *a++ = 0x81; *a++ = 0xfd;
1500     *a++ = (N_SAFE_BINTRANS_LIMIT-1) & 255;
1501     *a++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8) & 255; *a++ = 0; *a++ = 0;
1502     *a++ = 0x7c; failskip = a; *a++ = 0x01;
1503     bintrans_write_chunkreturn_fail(&a);
1504     *failskip = (size_t)a - (size_t)failskip - 1;
1505    
1506     /*
1507     * ebx = ((vaddr >> 22) & 1023) * sizeof(void *)
1508     *
1509     * 89 c3 mov %eax,%ebx
1510     * c1 eb 14 shr $20,%ebx
1511     * 81 e3 fc 0f 00 00 and $0xffc,%ebx
1512     */
1513     *a++ = 0x89; *a++ = 0xc3;
1514     *a++ = 0xc1; *a++ = 0xeb; *a++ = 0x14;
1515     *a++ = 0x81; *a++ = 0xe3; *a++ = 0xfc; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1516    
1517     /*
1518     * ecx = vaddr_to_hostaddr_table0
1519     *
1520     * 8b 8e 34 12 00 00 mov 0x1234(%esi),%ecx
1521     */
1522     ofs = ((size_t)&dummy_cpu.cd.mips.vaddr_to_hostaddr_table0) - (size_t)&dummy_cpu;
1523     *a++ = 0x8b; *a++ = 0x8e;
1524     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1525    
1526     /*
1527     * ecx = vaddr_to_hostaddr_table0[a]
1528     *
1529     * 8b 0c 19 mov (%ecx,%ebx),%ecx
1530     */
1531     *a++ = 0x8b; *a++ = 0x0c; *a++ = 0x19;
1532    
1533     /*
1534     * ebx = ((vaddr >> 12) & 1023) * sizeof(void *)
1535     *
1536     * 89 c3 mov %eax,%ebx
1537     * c1 eb 0a shr $10,%ebx
1538     * 81 e3 fc 0f 00 00 and $0xffc,%ebx
1539     */
1540     *a++ = 0x89; *a++ = 0xc3;
1541     *a++ = 0xc1; *a++ = 0xeb; *a++ = 0x0a;
1542     *a++ = 0x81; *a++ = 0xe3; *a++ = 0xfc; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1543    
1544     /*
1545     * ecx = vaddr_to_hostaddr_table0[a][b].cd.mips.chunks
1546     *
1547     * 8b 8c 19 56 34 12 00 mov 0x123456(%ecx,%ebx,1),%ecx
1548     */
1549     ofs = (size_t)&dummy_vth32_table.cd.mips.bintrans_chunks[0]
1550     - (size_t)&dummy_vth32_table;
1551    
1552     *a++ = 0x8b; *a++ = 0x8c; *a++ = 0x19;
1553     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1554    
1555     /*
1556     * ecx = NULL? Then return with failure.
1557     *
1558     * 83 f9 00 cmp $0x0,%ecx
1559     * 75 01 jne <okzzz>
1560     */
1561     *a++ = 0x83; *a++ = 0xf9; *a++ = 0x00;
1562     *a++ = 0x75; fail = a; *a++ = 0x00;
1563     bintrans_write_chunkreturn(&a);
1564     *fail = (size_t)a - (size_t)fail - 1;
1565    
1566     /*
1567     * 25 fc 0f 00 00 and $0xffc,%eax
1568     * 01 c1 add %eax,%ecx
1569     *
1570     * 8b 01 mov (%ecx),%eax
1571     *
1572     * 83 f8 00 cmp $0x0,%eax
1573     * 75 01 jne <ok>
1574     * c3 ret
1575     */
1576     *a++ = 0x25; *a++ = 0xfc; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1577     *a++ = 0x01; *a++ = 0xc1;
1578    
1579     *a++ = 0x8b; *a++ = 0x01;
1580    
1581     *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1582     *a++ = 0x75; fail = a; *a++ = 0x01;
1583     bintrans_write_chunkreturn(&a);
1584     *fail = (size_t)a - (size_t)fail - 1;
1585    
1586     /* 03 86 78 56 34 12 add 0x12345678(%esi),%eax */
1587     /* ff e0 jmp *%eax */
1588     ofs = ((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu;
1589     *a++ = 0x03; *a++ = 0x86;
1590     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1591     *a++ = 0xff; *a++ = 0xe0;
1592     #endif
1593     } else {
1594     /* Not much we can do here if this wasn't to the same physical page... */
1595    
1596     /* Don't execute too many instructions. */
1597     /* 81 fd f0 1f 00 00 cmpl $0x1ff0,%ebp */
1598     /* 7c 01 jl <okk> */
1599     /* c3 ret */
1600     *a++ = 0x81; *a++ = 0xfd;
1601     *a++ = (N_SAFE_BINTRANS_LIMIT-1) & 255;
1602     *a++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8) & 255; *a++ = 0; *a++ = 0;
1603     *a++ = 0x7c; failskip = a; *a++ = 0x01;
1604     bintrans_write_chunkreturn_fail(&a);
1605     *failskip = (size_t)a - (size_t)failskip - 1;
1606    
1607     /*
1608     * Compare the old pc (ecx:ebx) and the new pc (edx:eax). If they are on the
1609     * same virtual page (which means that they are on the same physical
1610     * page), then we can check the right chunk pointer, and if it
1611     * is non-NULL, then we can jump there. Otherwise just return.
1612     */
1613    
1614     /* Subtract 4 from the old pc first. (This is where the jump originated from.) */
1615     /* 83 eb 04 sub $0x4,%ebx */
1616     /* 83 d9 00 sbb $0x0,%ecx */
1617     *a++ = 0x83; *a++ = 0xeb; *a++ = 0x04;
1618     *a++ = 0x83; *a++ = 0xd9; *a++ = 0x00;
1619    
1620     /* 39 d1 cmp %edx,%ecx */
1621     /* 74 01 je 1b9 <ok2> */
1622     /* c3 ret */
1623     *a++ = 0x39; *a++ = 0xd1;
1624     *a++ = 0x74; *a++ = 0x01;
1625     *a++ = 0xc3;
1626    
1627     /* Remember new pc: */
1628     /* 89 c1 mov %eax,%ecx */
1629     *a++ = 0x89; *a++ = 0xc1;
1630    
1631     /* 81 e3 00 f0 ff ff and $0xfffff000,%ebx */
1632     /* 25 00 f0 ff ff and $0xfffff000,%eax */
1633     *a++ = 0x81; *a++ = 0xe3; *a++ = 0x00; *a++ = 0xf0; *a++ = 0xff; *a++ = 0xff;
1634     *a++ = 0x25; *a++ = 0x00; *a++ = 0xf0; *a++ = 0xff; *a++ = 0xff;
1635    
1636     /* 39 c3 cmp %eax,%ebx */
1637     /* 74 01 je <ok1> */
1638     /* c3 ret */
1639     *a++ = 0x39; *a++ = 0xc3;
1640     *a++ = 0x74; *a++ = 0x01;
1641     *a++ = 0xc3;
1642    
1643     /* 81 e1 ff 0f 00 00 and $0xfff,%ecx */
1644     *a++ = 0x81; *a++ = 0xe1; *a++ = 0xff; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1645    
1646     /* 8b 81 78 56 34 12 mov 0x12345678(%ecx),%eax */
1647     ofs = (size_t)chunks;
1648     *a++ = 0x8b; *a++ = 0x81; *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1649    
1650     /* 83 f8 00 cmp $0x0,%eax */
1651     /* 75 01 jne 1cd <okjump> */
1652     /* c3 ret */
1653     *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1654     *a++ = 0x75; *a++ = 0x01;
1655     *a++ = 0xc3;
1656    
1657     /* 03 86 78 56 34 12 add 0x12345678(%esi),%eax */
1658     /* ff e0 jmp *%eax */
1659     ofs = ((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu;
1660     *a++ = 0x03; *a++ = 0x86;
1661     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1662     *a++ = 0xff; *a++ = 0xe0;
1663     }
1664     } else {
1665     /*
1666     * Just to make sure that we don't become too unreliant
1667     * on the main program loop, we need to return every once
1668     * in a while (interrupts etc).
1669     *
1670     * Load the "nr of instructions executed" (which is an int)
1671     * and see if it is below a certain threshold. If so, then
1672     * we go on with the fast path (bintrans), otherwise we
1673     * abort by returning.
1674     */
1675     /* 81 fd f0 1f 00 00 cmpl $0x1ff0,%ebp */
1676     /* 7c 01 jl <okk> */
1677     /* c3 ret */
1678     if (!only_care_about_chunk_p && !forward) {
1679     *a++ = 0x81; *a++ = 0xfd;
1680     *a++ = (N_SAFE_BINTRANS_LIMIT-1) & 255;
1681     *a++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8) & 255; *a++ = 0; *a++ = 0;
1682     *a++ = 0x7c; failskip = a; *a++ = 0x01;
1683     bintrans_write_chunkreturn_fail(&a);
1684     *failskip = (size_t)a - (size_t)failskip - 1;
1685     }
1686    
1687     /*
1688     * potential_chunk_p points to an "uint32_t".
1689     * If this value is non-NULL, then it is a piece of i386
1690     * machine language code corresponding to the address
1691     * we're jumping to. Otherwise, those instructions haven't
1692     * been translated yet, so we have to return to the main
1693     * loop. (Actually, we have to add cpu->chunk_base_address.)
1694     *
1695     * Case 1: The value is non-NULL already at translation
1696     * time. Then we can make a direct (fast) native
1697     * i386 jump to the code chunk.
1698     *
1699     * Case 2: The value was NULL at translation time, then we
1700     * have to check during runtime.
1701     */
1702    
1703     /* Case 1: */
1704     /* printf("%08x ", *potential_chunk_p); */
1705     i386_addr = *potential_chunk_p +
1706     (size_t)mem->translation_code_chunk_space;
1707     i386_addr = i386_addr - ((size_t)a + 5);
1708     if ((*potential_chunk_p) != 0) {
1709     *a++ = 0xe9;
1710     *a++ = i386_addr;
1711     *a++ = i386_addr >> 8;
1712     *a++ = i386_addr >> 16;
1713     *a++ = i386_addr >> 24;
1714     } else {
1715     /* Case 2: */
1716    
1717     bintrans_register_potential_quick_jump(mem, a, p);
1718    
1719     i386_addr = (size_t)potential_chunk_p;
1720    
1721     /*
1722     * Load the chunk pointer into eax.
1723     * If it is NULL (zero), then skip the following jump.
1724     * Add chunk_base_address to eax, and jump to eax.
1725     */
1726    
1727     /* a1 78 56 34 12 mov 0x12345678,%eax */
1728     /* 83 f8 00 cmp $0x0,%eax */
1729     /* 75 01 jne <okaa> */
1730     /* c3 ret */
1731     *a++ = 0xa1;
1732     *a++ = i386_addr; *a++ = i386_addr >> 8;
1733     *a++ = i386_addr >> 16; *a++ = i386_addr >> 24;
1734     *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1735     *a++ = 0x75; *a++ = 0x01;
1736     *a++ = 0xc3;
1737    
1738     /* 03 86 78 56 34 12 add 0x12345678(%esi),%eax */
1739     /* ff e0 jmp *%eax */
1740     ofs = ((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu;
1741     *a++ = 0x03; *a++ = 0x86;
1742     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1743     *a++ = 0xff; *a++ = 0xe0;
1744     }
1745     }
1746    
1747     if (skip != NULL)
1748     *skip = (size_t)a - (size_t)skip - 1;
1749    
1750     *addrp = a;
1751     return 1;
1752     }
1753    
1754    
1755     /*
1756     * bintrans_write_instruction__loadstore():
1757     */
1758     static int bintrans_write_instruction__loadstore(struct memory *mem,
1759     unsigned char **addrp, int rt, int imm, int rs,
1760 dpavlin 12 int instruction_type, int bigendian, int do_alignment_check)
1761 dpavlin 2 {
1762     unsigned char *a, *retfail, *generic64bit, *doloadstore,
1763     *okret0, *okret1, *okret2, *skip;
1764     int ofs, alignment, load=0, unaligned=0;
1765    
1766     /* TODO: Not yet: */
1767     if (instruction_type == HI6_LQ_MDMX || instruction_type == HI6_SQ)
1768     return 0;
1769    
1770     switch (instruction_type) {
1771     case HI6_LQ_MDMX:
1772     case HI6_LDL:
1773     case HI6_LDR:
1774     case HI6_LD:
1775     case HI6_LWU:
1776     case HI6_LWL:
1777     case HI6_LWR:
1778     case HI6_LW:
1779     case HI6_LHU:
1780     case HI6_LH:
1781     case HI6_LBU:
1782     case HI6_LB:
1783     load = 1;
1784     if (rt == 0)
1785     return 0;
1786     }
1787    
1788     switch (instruction_type) {
1789     case HI6_LWL:
1790     case HI6_LWR:
1791     case HI6_LDL:
1792     case HI6_LDR:
1793     case HI6_SWL:
1794     case HI6_SWR:
1795     case HI6_SDL:
1796     case HI6_SDR:
1797     unaligned = 1;
1798     }
1799    
1800 dpavlin 12 /* TODO: Not yet: */
1801     if (bigendian && unaligned)
1802     return 0;
1803    
1804 dpavlin 2 a = *addrp;
1805    
1806     if (mem->bintrans_32bit_only)
1807     load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
1808     else
1809     load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
1810    
1811 dpavlin 12 if (imm != 0) {
1812     if (imm & 0x8000) {
1813     /* 05 34 f2 ff ff add $0xfffff234,%eax */
1814     /* 83 d2 ff adc $0xffffffff,%edx */
1815     *a++ = 5;
1816     *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
1817     if (!mem->bintrans_32bit_only) {
1818     *a++ = 0x83; *a++ = 0xd2; *a++ = 0xff;
1819     }
1820     } else {
1821     /* 05 34 12 00 00 add $0x1234,%eax */
1822     /* 83 d2 00 adc $0x0,%edx */
1823     *a++ = 5;
1824     *a++ = imm; *a++ = imm >> 8; *a++ = 0; *a++ = 0;
1825     if (!mem->bintrans_32bit_only) {
1826     *a++ = 0x83; *a++ = 0xd2; *a++ = 0;
1827     }
1828 dpavlin 2 }
1829     }
1830    
1831     alignment = 0;
1832     switch (instruction_type) {
1833     case HI6_LQ_MDMX:
1834     case HI6_SQ:
1835     alignment = 15;
1836     break;
1837     case HI6_LD:
1838     case HI6_LDL:
1839     case HI6_LDR:
1840     case HI6_SD:
1841     case HI6_SDL:
1842     case HI6_SDR:
1843     alignment = 7;
1844     break;
1845     case HI6_LW:
1846     case HI6_LWL:
1847     case HI6_LWR:
1848     case HI6_LWU:
1849     case HI6_SW:
1850     case HI6_SWL:
1851     case HI6_SWR:
1852     alignment = 3;
1853     break;
1854     case HI6_LH:
1855     case HI6_LHU:
1856     case HI6_SH:
1857     alignment = 1;
1858     break;
1859     }
1860    
1861     if (unaligned) {
1862     /*
1863     * Perform the actual load/store from an
1864     * aligned address.
1865     *
1866     * 83 e0 fc and $0xfffffffc,%eax
1867     */
1868     *a++ = 0x83; *a++ = 0xe0; *a++ = 0xff - alignment;
1869 dpavlin 12 } else if (alignment > 0 && do_alignment_check) {
1870 dpavlin 2 unsigned char *alignskip;
1871     /*
1872     * Check alignment:
1873     *
1874     * 89 c3 mov %eax,%ebx
1875     * 83 e3 01 and $0x1,%ebx
1876     * 74 01 jz <ok>
1877     * c3 ret
1878     */
1879     *a++ = 0x89; *a++ = 0xc3;
1880     *a++ = 0x83; *a++ = 0xe3; *a++ = alignment;
1881     *a++ = 0x74; alignskip = a; *a++ = 0x00;
1882     bintrans_write_chunkreturn_fail(&a);
1883     *alignskip = (size_t)a - (size_t)alignskip - 1;
1884     }
1885    
1886    
1887     /* Here, edx:eax = vaddr */
1888    
1889     if (mem->bintrans_32bit_only) {
1890 dpavlin 12 /* ebx = vaddr >> 12; */
1891     *a++ = 0x89; *a++ = 0xc3; /* mov %eax, %ebx */
1892     *a++ = 0xc1; *a++ = 0xeb; *a++ = 0x0c; /* shr $12, %ebx */
1893 dpavlin 2
1894 dpavlin 12 if (load) {
1895     /* ecx = cpu->cd.mips.host_load */
1896     *a++ = 0x8b; *a++ = 0x8e; *a++ = ofs_h_l & 255;
1897     *a++ = (ofs_h_l >> 8) & 255;
1898     *a++ = (ofs_h_l >> 16) & 255; *a++ = (ofs_h_l >> 24) & 255;
1899     } else {
1900     /* ecx = cpu->cd.mips.host_store */
1901     *a++ = 0x8b; *a++ = 0x8e; *a++ = ofs_h_s & 255;
1902     *a++ = (ofs_h_s >> 8) & 255;
1903     *a++ = (ofs_h_s >> 16) & 255; *a++ = (ofs_h_s >> 24) & 255;
1904     }
1905    
1906     /* ecx = host_load[a] (or host_store[a]) */
1907     *a++ = 0x8b; *a++ = 0x0c; *a++ = 0x99; /* mov (%ecx,%ebx,4),%ecx */
1908    
1909 dpavlin 2 /*
1910     * ecx = NULL? Then return with failure.
1911     *
1912     * 83 f9 00 cmp $0x0,%ecx
1913     * 75 01 jne <okzzz>
1914     */
1915     *a++ = 0x83; *a++ = 0xf9; *a++ = 0x00;
1916     *a++ = 0x75; retfail = a; *a++ = 0x00;
1917     bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
1918     *retfail = (size_t)a - (size_t)retfail - 1;
1919    
1920     /*
1921     * eax = offset within page = vaddr & 0xfff
1922     * ecx = host address ( = host page + offset)
1923     *
1924 dpavlin 12 * 25 ff 0f 00 00 and $0xfff,%eax
1925 dpavlin 2 * 01 c1 add %eax,%ecx
1926     */
1927 dpavlin 12 *a++ = 0x25; *a++ = 0xff; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1928 dpavlin 2 *a++ = 0x01; *a++ = 0xc1;
1929     } else {
1930     /*
1931     * If the load/store address has the top 32 bits set to
1932     * 0x00000000 or 0xffffffff, then we can use the 32-bit
1933     * lookup tables:
1934     *
1935    
1936     TODO: top 33 bits!!!!!!!
1937    
1938     * 83 fa 00 cmp $0x0,%edx
1939     * 74 05 je <ok32>
1940     * 83 fa ff cmp $0xffffffff,%edx
1941     * 75 01 jne <not32>
1942     */
1943     *a++ = 0x83; *a++ = 0xfa; *a++ = 0x00;
1944     *a++ = 0x74; *a++ = 0x05;
1945     *a++ = 0x83; *a++ = 0xfa; *a++ = 0xff;
1946     *a++ = 0x75; generic64bit = a; *a++ = 0x01;
1947    
1948     /* Call the quick lookup routine: */
1949 dpavlin 10 if (load)
1950     ofs = (size_t)bintrans_load_32bit;
1951     else
1952     ofs = (size_t)bintrans_store_32bit;
1953 dpavlin 2 ofs = ofs - ((size_t)a + 5);
1954     *a++ = 0xe8; *a++ = ofs; *a++ = ofs >> 8;
1955     *a++ = ofs >> 16; *a++ = ofs >> 24;
1956    
1957     /*
1958     * ecx = NULL? Then return with failure.
1959     *
1960     * 83 f9 00 cmp $0x0,%ecx
1961     * 75 01 jne <okzzz>
1962     */
1963     *a++ = 0x83; *a++ = 0xf9; *a++ = 0x00;
1964     *a++ = 0x75; retfail = a; *a++ = 0x00;
1965     bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
1966     *retfail = (size_t)a - (size_t)retfail - 1;
1967    
1968     /*
1969     * eax = offset within page = vaddr & 0xfff
1970     * ecx = host address ( = host page + offset)
1971     *
1972 dpavlin 12 * 25 ff 0f 00 00 and $0xfff,%eax
1973 dpavlin 2 * 01 c1 add %eax,%ecx
1974     */
1975 dpavlin 12 *a++ = 0x25; *a++ = 0xff; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1976 dpavlin 2 *a++ = 0x01; *a++ = 0xc1;
1977    
1978     *a++ = 0xeb; doloadstore = a; *a++ = 0x01;
1979    
1980    
1981     /* TODO: The stuff above is so similar to the pure 32-bit
1982     case that it should be factored out. */
1983    
1984    
1985     *generic64bit = (size_t)a - (size_t)generic64bit - 1;
1986    
1987     /*
1988     * 64-bit generic case:
1989     */
1990    
1991     /* push writeflag */
1992     *a++ = 0x6a; *a++ = load? 0 : 1;
1993    
1994     /* push vaddr (edx:eax) */
1995     *a++ = 0x52; *a++ = 0x50;
1996    
1997     /* push cpu (esi) */
1998     *a++ = 0x56;
1999    
2000     /* eax = points to the right function */
2001     ofs = ((size_t)&dummy_cpu.cd.mips.fast_vaddr_to_hostaddr) - (size_t)&dummy_cpu;
2002     *a++ = 0x8b; *a++ = 0x86;
2003     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
2004    
2005     /* ff d0 call *%eax */
2006     *a++ = 0xff; *a++ = 0xd0;
2007    
2008     /* 83 c4 08 add $0x10,%esp */
2009     *a++ = 0x83; *a++ = 0xc4; *a++ = 0x10;
2010    
2011     /* If eax is NULL, then return. */
2012     /* 83 f8 00 cmp $0x0,%eax */
2013     /* 75 01 jne 1cd <okjump> */
2014     /* c3 ret */
2015     *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
2016     *a++ = 0x75; retfail = a; *a++ = 0x00;
2017     bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
2018     *retfail = (size_t)a - (size_t)retfail - 1;
2019    
2020     /* 89 c1 mov %eax,%ecx */
2021     *a++ = 0x89; *a++ = 0xc1;
2022    
2023     *doloadstore = (size_t)a - (size_t)doloadstore - 1;
2024     }
2025    
2026    
2027     if (!load) {
2028 dpavlin 12 if (rt == MIPS_GPR_ZERO) {
2029     switch (alignment) {
2030     case 7: *a++ = 0x31; *a++ = 0xd2; /* edx = 0 */
2031     case 3:
2032     case 1: *a++ = 0x31; *a++ = 0xc0; break; /* eax = 0 */
2033     case 0: *a++ = 0xb0; *a++ = 0x00; break; /* al = 0 */
2034     default:fatal("todo: zero\n"); exit(1);
2035     }
2036     } else {
2037     if (alignment >= 7)
2038     load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2039     else
2040     load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2041     }
2042 dpavlin 2 }
2043    
2044     switch (instruction_type) {
2045     case HI6_LD:
2046 dpavlin 12 if (bigendian) {
2047     /* 8b 11 mov (%ecx),%edx */
2048     /* 8b 41 04 mov 0x4(%ecx),%eax */
2049     /* 0f c8 bswap %eax */
2050     /* 0f ca bswap %edx */
2051     *a++ = 0x8b; *a++ = 0x11;
2052     *a++ = 0x8b; *a++ = 0x41; *a++ = 0x04;
2053     *a++ = 0x0f; *a++ = 0xc8;
2054     *a++ = 0x0f; *a++ = 0xca;
2055     } else {
2056     /* 8b 01 mov (%ecx),%eax */
2057     /* 8b 51 04 mov 0x4(%ecx),%edx */
2058     *a++ = 0x8b; *a++ = 0x01;
2059     *a++ = 0x8b; *a++ = 0x51; *a++ = 0x04;
2060     }
2061 dpavlin 2 break;
2062     case HI6_LWU:
2063     /* 8b 01 mov (%ecx),%eax */
2064 dpavlin 12 /* 0f c8 bswap %eax (big endian) */
2065 dpavlin 2 /* 31 d2 xor %edx,%edx */
2066     *a++ = 0x8b; *a++ = 0x01;
2067 dpavlin 12 if (bigendian) {
2068     *a++ = 0x0f; *a++ = 0xc8;
2069     }
2070 dpavlin 2 *a++ = 0x31; *a++ = 0xd2;
2071     break;
2072     case HI6_LW:
2073     /* 8b 01 mov (%ecx),%eax */
2074 dpavlin 12 /* 0f c8 bswap %eax (big endian) */
2075 dpavlin 2 /* 99 cltd */
2076     *a++ = 0x8b; *a++ = 0x01;
2077 dpavlin 12 if (bigendian) {
2078     *a++ = 0x0f; *a++ = 0xc8;
2079     }
2080 dpavlin 2 *a++ = 0x99;
2081     break;
2082     case HI6_LHU:
2083     /* 31 c0 xor %eax,%eax */
2084     /* 66 8b 01 mov (%ecx),%ax */
2085 dpavlin 12 /* 86 c4 xchg %al,%ah (big endian) */
2086 dpavlin 2 /* 99 cltd */
2087     *a++ = 0x31; *a++ = 0xc0;
2088     *a++ = 0x66; *a++ = 0x8b; *a++ = 0x01;
2089 dpavlin 12 if (bigendian) {
2090     *a++ = 0x86; *a++ = 0xc4;
2091     }
2092 dpavlin 2 *a++ = 0x99;
2093     break;
2094     case HI6_LH:
2095     /* 66 8b 01 mov (%ecx),%ax */
2096 dpavlin 12 /* 86 c4 xchg %al,%ah (big endian) */
2097 dpavlin 2 /* 98 cwtl */
2098     /* 99 cltd */
2099     *a++ = 0x66; *a++ = 0x8b; *a++ = 0x01;
2100 dpavlin 12 if (bigendian) {
2101     *a++ = 0x86; *a++ = 0xc4;
2102     }
2103 dpavlin 2 *a++ = 0x98;
2104     *a++ = 0x99;
2105     break;
2106     case HI6_LBU:
2107     /* 31 c0 xor %eax,%eax */
2108     /* 8a 01 mov (%ecx),%al */
2109     /* 99 cltd */
2110     *a++ = 0x31; *a++ = 0xc0;
2111     *a++ = 0x8a; *a++ = 0x01;
2112     *a++ = 0x99;
2113     break;
2114     case HI6_LB:
2115     /* 8a 01 mov (%ecx),%al */
2116     /* 66 98 cbtw */
2117     /* 98 cwtl */
2118     /* 99 cltd */
2119     *a++ = 0x8a; *a++ = 0x01;
2120     *a++ = 0x66; *a++ = 0x98;
2121     *a++ = 0x98;
2122     *a++ = 0x99;
2123     break;
2124    
2125     case HI6_LWL:
2126     load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
2127     /* 05 34 f2 ff ff add $0xfffff234,%eax */
2128     *a++ = 5;
2129     *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
2130     /* 83 e0 03 and $0x03,%eax */
2131     *a++ = 0x83; *a++ = 0xe0; *a++ = alignment;
2132     /* 89 c3 mov %eax,%ebx */
2133     *a++ = 0x89; *a++ = 0xc3;
2134    
2135     load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2136    
2137     /* ALIGNED LOAD: */
2138     /* 8b 11 mov (%ecx),%edx */
2139     *a++ = 0x8b; *a++ = 0x11;
2140    
2141     /*
2142     * CASE 0:
2143     * memory = 0x12 0x34 0x56 0x78
2144     * register after lwl: 0x12 0x.. 0x.. 0x..
2145     */
2146     /* 83 fb 00 cmp $0x0,%ebx */
2147     /* 75 01 jne <skip> */
2148     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
2149     *a++ = 0x75; skip = a; *a++ = 0x01;
2150    
2151     /* c1 e2 18 shl $0x18,%edx */
2152     /* 25 ff ff ff 00 and $0xffffff,%eax */
2153     /* 09 d0 or %edx,%eax */
2154     *a++ = 0xc1; *a++ = 0xe2; *a++ = 0x18;
2155     *a++ = 0x25; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff; *a++ = 0x00;
2156     *a++ = 0x09; *a++ = 0xd0;
2157    
2158     /* eb 00 jmp <okret> */
2159     *a++ = 0xeb; okret0 = a; *a++ = 0;
2160    
2161     *skip = (size_t)a - (size_t)skip - 1;
2162    
2163     /*
2164     * CASE 1:
2165     * memory = 0x12 0x34 0x56 0x78
2166     * register after lwl: 0x34 0x12 0x.. 0x..
2167     */
2168     /* 83 fb 01 cmp $0x1,%ebx */
2169     /* 75 01 jne <skip> */
2170     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x01;
2171     *a++ = 0x75; skip = a; *a++ = 0x01;
2172    
2173     /* c1 e2 10 shl $0x10,%edx */
2174     /* 25 ff ff 00 00 and $0xffff,%eax */
2175     /* 09 d0 or %edx,%eax */
2176     *a++ = 0xc1; *a++ = 0xe2; *a++ = 0x10;
2177     *a++ = 0x25; *a++ = 0xff; *a++ = 0xff; *a++ = 0x00; *a++ = 0x00;
2178     *a++ = 0x09; *a++ = 0xd0;
2179    
2180     /* eb 00 jmp <okret> */
2181     *a++ = 0xeb; okret1 = a; *a++ = 0;
2182    
2183     *skip = (size_t)a - (size_t)skip - 1;
2184    
2185     /*
2186     * CASE 2:
2187     * memory = 0x12 0x34 0x56 0x78
2188     * register after lwl: 0x56 0x34 0x12 0x..
2189     */
2190     /* 83 fb 02 cmp $0x2,%ebx */
2191     /* 75 01 jne <skip> */
2192     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x02;
2193     *a++ = 0x75; skip = a; *a++ = 0x01;
2194    
2195     /* c1 e2 08 shl $0x08,%edx */
2196     /* 25 ff 00 00 00 and $0xff,%eax */
2197     /* 09 d0 or %edx,%eax */
2198     *a++ = 0xc1; *a++ = 0xe2; *a++ = 0x08;
2199     *a++ = 0x25; *a++ = 0xff; *a++ = 0x00; *a++ = 0x00; *a++ = 0x00;
2200     *a++ = 0x09; *a++ = 0xd0;
2201    
2202     /* eb 00 jmp <okret> */
2203     *a++ = 0xeb; okret2 = a; *a++ = 0;
2204    
2205     *skip = (size_t)a - (size_t)skip - 1;
2206    
2207     /*
2208     * CASE 3:
2209     * memory = 0x12 0x34 0x56 0x78
2210     * register after lwl: 0x78 0x56 0x34 0x12
2211     */
2212     /* 89 d0 mov %edx,%eax */
2213     *a++ = 0x89; *a++ = 0xd0;
2214    
2215     /* okret: */
2216     *okret0 = (size_t)a - (size_t)okret0 - 1;
2217     *okret1 = (size_t)a - (size_t)okret1 - 1;
2218     *okret2 = (size_t)a - (size_t)okret2 - 1;
2219    
2220     /* 99 cltd */
2221     *a++ = 0x99;
2222     break;
2223    
2224     case HI6_LWR:
2225     load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
2226     /* 05 34 f2 ff ff add $0xfffff234,%eax */
2227     *a++ = 5;
2228     *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
2229     /* 83 e0 03 and $0x03,%eax */
2230     *a++ = 0x83; *a++ = 0xe0; *a++ = alignment;
2231     /* 89 c3 mov %eax,%ebx */
2232     *a++ = 0x89; *a++ = 0xc3;
2233    
2234     load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2235    
2236     /* ALIGNED LOAD: */
2237     /* 8b 11 mov (%ecx),%edx */
2238     *a++ = 0x8b; *a++ = 0x11;
2239    
2240     /*
2241     * CASE 0:
2242     * memory = 0x12 0x34 0x56 0x78
2243     * register after lwr: 0x78 0x56 0x34 0x12
2244     */
2245     /* 83 fb 00 cmp $0x0,%ebx */
2246     /* 75 01 jne <skip> */
2247     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
2248     *a++ = 0x75; skip = a; *a++ = 0x01;
2249    
2250     /* 89 d0 mov %edx,%eax */
2251     *a++ = 0x89; *a++ = 0xd0;
2252    
2253     /* eb 00 jmp <okret> */
2254     *a++ = 0xeb; okret0 = a; *a++ = 0;
2255    
2256     *skip = (size_t)a - (size_t)skip - 1;
2257    
2258     /*
2259     * CASE 1:
2260     * memory = 0x12 0x34 0x56 0x78
2261     * register after lwr: 0x.. 0x78 0x56 0x34
2262     */
2263     /* 83 fb 01 cmp $0x1,%ebx */
2264     /* 75 01 jne <skip> */
2265     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x01;
2266     *a++ = 0x75; skip = a; *a++ = 0x01;
2267    
2268     /* c1 ea 08 shr $0x8,%edx */
2269     /* 25 00 00 00 ff and $0xff000000,%eax */
2270     /* 09 d0 or %edx,%eax */
2271     *a++ = 0xc1; *a++ = 0xea; *a++ = 0x08;
2272     *a++ = 0x25; *a++ = 0x00; *a++ = 0x00; *a++ = 0x00; *a++ = 0xff;
2273     *a++ = 0x09; *a++ = 0xd0;
2274    
2275     /* eb 00 jmp <okret> */
2276     *a++ = 0xeb; okret1 = a; *a++ = 0;
2277    
2278     *skip = (size_t)a - (size_t)skip - 1;
2279    
2280     /*
2281     * CASE 2:
2282     * memory = 0x12 0x34 0x56 0x78
2283     * register after lwr: 0x.. 0x.. 0x78 0x56
2284     */
2285     /* 83 fb 02 cmp $0x2,%ebx */
2286     /* 75 01 jne <skip> */
2287     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x02;
2288     *a++ = 0x75; skip = a; *a++ = 0x01;
2289    
2290     /* c1 ea 10 shr $0x10,%edx */
2291     /* 25 00 00 ff ff and $0xffff0000,%eax */
2292     /* 09 d0 or %edx,%eax */
2293     *a++ = 0xc1; *a++ = 0xea; *a++ = 0x10;
2294     *a++ = 0x25; *a++ = 0x00; *a++ = 0x00; *a++ = 0xff; *a++ = 0xff;
2295     *a++ = 0x09; *a++ = 0xd0;
2296    
2297     /* eb 00 jmp <okret> */
2298     *a++ = 0xeb; okret2 = a; *a++ = 0;
2299    
2300     *skip = (size_t)a - (size_t)skip - 1;
2301    
2302     /*
2303     * CASE 3:
2304     * memory = 0x12 0x34 0x56 0x78
2305     * register after lwr: 0x.. 0x.. 0x.. 0x78
2306     */
2307     /* c1 ea 18 shr $0x18,%edx */
2308     /* 25 00 ff ff ff and $0xffffff00,%eax */
2309     /* 09 d0 or %edx,%eax */
2310     *a++ = 0xc1; *a++ = 0xea; *a++ = 0x18;
2311     *a++ = 0x25; *a++ = 0x00; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff;
2312     *a++ = 0x09; *a++ = 0xd0;
2313    
2314     /* okret: */
2315     *okret0 = (size_t)a - (size_t)okret0 - 1;
2316     *okret1 = (size_t)a - (size_t)okret1 - 1;
2317     *okret2 = (size_t)a - (size_t)okret2 - 1;
2318    
2319     /* 99 cltd */
2320     *a++ = 0x99;
2321     break;
2322    
2323     case HI6_SD:
2324 dpavlin 12 if (bigendian) {
2325     /* 0f c8 bswap %eax */
2326     /* 0f ca bswap %edx */
2327     /* 89 11 mov %edx,(%ecx) */
2328     /* 89 41 04 mov %eax,0x4(%ecx) */
2329     *a++ = 0x0f; *a++ = 0xc8;
2330     *a++ = 0x0f; *a++ = 0xca;
2331     *a++ = 0x89; *a++ = 0x11;
2332     *a++ = 0x89; *a++ = 0x41; *a++ = 0x04;
2333     } else {
2334     /* 89 01 mov %eax,(%ecx) */
2335     /* 89 51 04 mov %edx,0x4(%ecx) */
2336     *a++ = 0x89; *a++ = 0x01;
2337     *a++ = 0x89; *a++ = 0x51; *a++ = 0x04;
2338     }
2339 dpavlin 2 break;
2340     case HI6_SW:
2341 dpavlin 12 /* 0f c8 bswap %eax (big endian) */
2342     if (bigendian) {
2343     *a++ = 0x0f; *a++ = 0xc8;
2344     }
2345 dpavlin 2 /* 89 01 mov %eax,(%ecx) */
2346     *a++ = 0x89; *a++ = 0x01;
2347     break;
2348     case HI6_SH:
2349 dpavlin 12 /* 86 c4 xchg %al,%ah (big endian) */
2350     if (bigendian) {
2351     *a++ = 0x86; *a++ = 0xc4;
2352     }
2353 dpavlin 2 /* 66 89 01 mov %ax,(%ecx) */
2354     *a++ = 0x66; *a++ = 0x89; *a++ = 0x01;
2355     break;
2356     case HI6_SB:
2357     /* 88 01 mov %al,(%ecx) */
2358     *a++ = 0x88; *a++ = 0x01;
2359     break;
2360    
2361     case HI6_SWL:
2362     load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
2363     /* 05 34 f2 ff ff add $0xfffff234,%eax */
2364     *a++ = 5;
2365     *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
2366     /* 83 e0 03 and $0x03,%eax */
2367     *a++ = 0x83; *a++ = 0xe0; *a++ = alignment;
2368     /* 89 c3 mov %eax,%ebx */
2369     *a++ = 0x89; *a++ = 0xc3;
2370    
2371     load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2372    
2373     /* ALIGNED LOAD: */
2374     /* 8b 11 mov (%ecx),%edx */
2375     *a++ = 0x8b; *a++ = 0x11;
2376    
2377     /*
2378     * CASE 0:
2379     * memory (edx): 0x12 0x34 0x56 0x78
2380     * register (eax): 0x89abcdef
2381     * mem after swl: 0x89 0x.. 0x.. 0x..
2382     */
2383     /* 83 fb 00 cmp $0x0,%ebx */
2384     /* 75 01 jne <skip> */
2385     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
2386     *a++ = 0x75; skip = a; *a++ = 0x01;
2387    
2388     /* 81 e2 00 ff ff ff and $0xffffff00,%edx */
2389     /* c1 e8 18 shr $0x18,%eax */
2390     /* 09 d0 or %edx,%eax */
2391     *a++ = 0x81; *a++ = 0xe2; *a++ = 0x00; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff;
2392     *a++ = 0xc1; *a++ = 0xe8; *a++ = 0x18;
2393     *a++ = 0x09; *a++ = 0xd0;
2394    
2395     /* eb 00 jmp <okret> */
2396     *a++ = 0xeb; okret0 = a; *a++ = 0;
2397    
2398     *skip = (size_t)a - (size_t)skip - 1;
2399    
2400     /*
2401     * CASE 1:
2402     * memory (edx): 0x12 0x34 0x56 0x78
2403     * register (eax): 0x89abcdef
2404     * mem after swl: 0xab 0x89 0x.. 0x..
2405     */
2406     /* 83 fb 01 cmp $0x1,%ebx */
2407     /* 75 01 jne <skip> */
2408     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x01;
2409     *a++ = 0x75; skip = a; *a++ = 0x01;
2410    
2411     /* 81 e2 00 00 ff ff and $0xffff0000,%edx */
2412     /* c1 e8 10 shr $0x10,%eax */
2413     /* 09 d0 or %edx,%eax */
2414     *a++ = 0x81; *a++ = 0xe2; *a++ = 0x00; *a++ = 0x00; *a++ = 0xff; *a++ = 0xff;
2415     *a++ = 0xc1; *a++ = 0xe8; *a++ = 0x10;
2416     *a++ = 0x09; *a++ = 0xd0;
2417    
2418     /* eb 00 jmp <okret> */
2419     *a++ = 0xeb; okret1 = a; *a++ = 0;
2420    
2421     *skip = (size_t)a - (size_t)skip - 1;
2422    
2423     /*
2424     * CASE 2:
2425     * memory (edx): 0x12 0x34 0x56 0x78
2426     * register (eax): 0x89abcdef
2427     * mem after swl: 0xcd 0xab 0x89 0x..
2428     */
2429     /* 83 fb 02 cmp $0x2,%ebx */
2430     /* 75 01 jne <skip> */
2431     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x02;
2432     *a++ = 0x75; skip = a; *a++ = 0x01;
2433    
2434     /* 81 e2 00 00 00 ff and $0xff000000,%edx */
2435     /* c1 e8 08 shr $0x08,%eax */
2436     /* 09 d0 or %edx,%eax */
2437     *a++ = 0x81; *a++ = 0xe2; *a++ = 0x00; *a++ = 0x00; *a++ = 0x00; *a++ = 0xff;
2438     *a++ = 0xc1; *a++ = 0xe8; *a++ = 0x08;
2439     *a++ = 0x09; *a++ = 0xd0;
2440    
2441     /* eb 00 jmp <okret> */
2442     *a++ = 0xeb; okret2 = a; *a++ = 0;
2443    
2444     *skip = (size_t)a - (size_t)skip - 1;
2445    
2446     /*
2447     * CASE 3:
2448     * memory (edx): 0x12 0x34 0x56 0x78
2449     * register (eax): 0x89abcdef
2450     * mem after swl: 0xef 0xcd 0xab 0x89
2451     */
2452     /* eax = eax :-) */
2453    
2454     /* okret: */
2455     *okret0 = (size_t)a - (size_t)okret0 - 1;
2456     *okret1 = (size_t)a - (size_t)okret1 - 1;
2457     *okret2 = (size_t)a - (size_t)okret2 - 1;
2458    
2459     /* Store back to memory: */
2460     /* 89 01 mov %eax,(%ecx) */
2461     *a++ = 0x89; *a++ = 0x01;
2462     break;
2463    
2464     case HI6_SWR:
2465     load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
2466     /* 05 34 f2 ff ff add $0xfffff234,%eax */
2467     *a++ = 5;
2468     *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
2469     /* 83 e0 03 and $0x03,%eax */
2470     *a++ = 0x83; *a++ = 0xe0; *a++ = alignment;
2471     /* 89 c3 mov %eax,%ebx */
2472     *a++ = 0x89; *a++ = 0xc3;
2473    
2474     load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2475    
2476     /* ALIGNED LOAD: */
2477     /* 8b 11 mov (%ecx),%edx */
2478     *a++ = 0x8b; *a++ = 0x11;
2479    
2480     /*
2481     * CASE 0:
2482     * memory (edx): 0x12 0x34 0x56 0x78
2483     * register (eax): 0x89abcdef
2484     * mem after swr: 0xef 0xcd 0xab 0x89
2485     */
2486     /* 83 fb 00 cmp $0x0,%ebx */
2487     /* 75 01 jne <skip> */
2488     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
2489     *a++ = 0x75; skip = a; *a++ = 0x01;
2490    
2491     /* eax = eax, so do nothing */
2492    
2493     /* eb 00 jmp <okret> */
2494     *a++ = 0xeb; okret0 = a; *a++ = 0;
2495    
2496     *skip = (size_t)a - (size_t)skip - 1;
2497    
2498     /*
2499     * CASE 1:
2500     * memory (edx): 0x12 0x34 0x56 0x78
2501     * register (eax): 0x89abcdef
2502     * mem after swr: 0x12 0xef 0xcd 0xab
2503     */
2504     /* 83 fb 01 cmp $0x1,%ebx */
2505     /* 75 01 jne <skip> */
2506     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x01;
2507     *a++ = 0x75; skip = a; *a++ = 0x01;
2508    
2509     /* 81 e2 ff 00 00 00 and $0x000000ff,%edx */
2510     /* c1 e0 08 shl $0x08,%eax */
2511     /* 09 d0 or %edx,%eax */
2512     *a++ = 0x81; *a++ = 0xe2; *a++ = 0xff; *a++ = 0x00; *a++ = 0x00; *a++ = 0x00;
2513     *a++ = 0xc1; *a++ = 0xe0; *a++ = 0x08;
2514     *a++ = 0x09; *a++ = 0xd0;
2515    
2516     /* eb 00 jmp <okret> */
2517     *a++ = 0xeb; okret1 = a; *a++ = 0;
2518    
2519     *skip = (size_t)a - (size_t)skip - 1;
2520    
2521     /*
2522     * CASE 2:
2523     * memory (edx): 0x12 0x34 0x56 0x78
2524     * register (eax): 0x89abcdef
2525     * mem after swr: 0x12 0x34 0xef 0xcd
2526     */
2527     /* 83 fb 02 cmp $0x2,%ebx */
2528     /* 75 01 jne <skip> */
2529     *a++ = 0x83; *a++ = 0xfb; *a++ = 0x02;
2530     *a++ = 0x75; skip = a; *a++ = 0x01;
2531    
2532     /* 81 e2 ff ff 00 00 and $0x0000ffff,%edx */
2533     /* c1 e0 10 shl $0x10,%eax */
2534     /* 09 d0 or %edx,%eax */
2535     *a++ = 0x81; *a++ = 0xe2; *a++ = 0xff; *a++ = 0xff; *a++ = 0x00; *a++ = 0x00;
2536     *a++ = 0xc1; *a++ = 0xe0; *a++ = 0x10;
2537     *a++ = 0x09; *a++ = 0xd0;
2538    
2539     /* eb 00 jmp <okret> */
2540     *a++ = 0xeb; okret2 = a; *a++ = 0;
2541    
2542     *skip = (size_t)a - (size_t)skip - 1;
2543    
2544     /*
2545     * CASE 3:
2546     * memory (edx): 0x12 0x34 0x56 0x78
2547     * register (eax): 0x89abcdef
2548     * mem after swr: 0x12 0x34 0x56 0xef
2549     */
2550     /* 81 e2 ff ff ff 00 and $0x00ffffff,%edx */
2551     /* c1 e0 18 shl $0x18,%eax */
2552     /* 09 d0 or %edx,%eax */
2553     *a++ = 0x81; *a++ = 0xe2; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff; *a++ = 0x00;
2554     *a++ = 0xc1; *a++ = 0xe0; *a++ = 0x18;
2555     *a++ = 0x09; *a++ = 0xd0;
2556    
2557    
2558     /* okret: */
2559     *okret0 = (size_t)a - (size_t)okret0 - 1;
2560     *okret1 = (size_t)a - (size_t)okret1 - 1;
2561     *okret2 = (size_t)a - (size_t)okret2 - 1;
2562    
2563     /* Store back to memory: */
2564     /* 89 01 mov %eax,(%ecx) */
2565     *a++ = 0x89; *a++ = 0x01;
2566     break;
2567    
2568     default:
2569     bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
2570     }
2571    
2572     if (load && rt != 0)
2573     store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2574    
2575     *addrp = a;
2576     bintrans_write_pc_inc(addrp);
2577     return 1;
2578     }
2579    
2580    
2581     /*
2582     * bintrans_write_instruction__tlb_rfe_etc():
2583     */
2584     static int bintrans_write_instruction__tlb_rfe_etc(unsigned char **addrp,
2585     int itype)
2586     {
2587     unsigned char *a;
2588     int ofs;
2589    
2590     switch (itype) {
2591     case CALL_TLBP:
2592     case CALL_TLBR:
2593     case CALL_TLBWR:
2594     case CALL_TLBWI:
2595     case CALL_RFE:
2596     case CALL_ERET:
2597     case CALL_SYSCALL:
2598     case CALL_BREAK:
2599     break;
2600     default:
2601     return 0;
2602     }
2603    
2604     a = *addrp;
2605    
2606     /* Put back PC into the cpu struct, both as pc and pc_last */
2607     *a++ = 0x89; *a++ = 0xbe; *a++ = ofs_pc&255;
2608     *a++ = (ofs_pc>>8)&255; *a++ = (ofs_pc>>16)&255;
2609     *a++ = (ofs_pc>>24)&255; /* mov %edi,pc(%esi) */
2610    
2611     *a++ = 0x89; *a++ = 0xbe; *a++ = ofs_pc_last&255;
2612     *a++ = (ofs_pc_last>>8)&255; *a++ = (ofs_pc_last>>16)&255;
2613     *a++ = (ofs_pc_last>>24)&255; /* mov %edi,pc_last(%esi) */
2614    
2615     /* ... and make sure that the high 32 bits are ALSO in pc_last: */
2616     /* 8b 86 38 12 00 00 mov 0x1238(%esi),%eax */
2617     ofs = ofs_pc + 4;
2618     *a++ = 0x8b; *a++ = 0x86; *a++ = ofs&255;
2619     *a++ = (ofs>>8)&255; *a++ = (ofs>>16)&255;
2620     *a++ = (ofs>>24)&255; /* mov %edi,pc(%esi) */
2621    
2622     /* 89 86 34 12 00 00 mov %eax,0x1234(%esi) */
2623     ofs = ofs_pc_last + 4;
2624     *a++ = 0x89; *a++ = 0x86; *a++ = ofs&255;
2625     *a++ = (ofs>>8)&255; *a++ = (ofs>>16)&255;
2626     *a++ = (ofs>>24)&255; /* mov %edi,pc(%esi) */
2627    
2628     switch (itype) {
2629     case CALL_TLBP:
2630     case CALL_TLBR:
2631     /* push readflag */
2632     *a++ = 0x6a; *a++ = (itype == CALL_TLBR);
2633     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_tlbpr) - (size_t)&dummy_cpu;
2634     break;
2635     case CALL_TLBWR:
2636     case CALL_TLBWI:
2637     /* push randomflag */
2638     *a++ = 0x6a; *a++ = (itype == CALL_TLBWR);
2639     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_tlbwri) - (size_t)&dummy_cpu;
2640     break;
2641     case CALL_SYSCALL:
2642     case CALL_BREAK:
2643     /* push randomflag */
2644     *a++ = 0x6a; *a++ = (itype == CALL_BREAK? EXCEPTION_BP : EXCEPTION_SYS);
2645     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_simple_exception) - (size_t)&dummy_cpu;
2646     break;
2647     case CALL_RFE:
2648     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_rfe) - (size_t)&dummy_cpu;
2649     break;
2650     case CALL_ERET:
2651     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_eret) - (size_t)&dummy_cpu;
2652     break;
2653     }
2654    
2655     /* push cpu (esi) */
2656     *a++ = 0x56;
2657    
2658     /* eax = points to the right function */
2659     *a++ = 0x8b; *a++ = 0x86;
2660     *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
2661    
2662     /* ff d0 call *%eax */
2663     *a++ = 0xff; *a++ = 0xd0;
2664    
2665     switch (itype) {
2666     case CALL_RFE:
2667     case CALL_ERET:
2668     /* 83 c4 04 add $4,%esp */
2669     *a++ = 0x83; *a++ = 0xc4; *a++ = 4;
2670     break;
2671     default:
2672     /* 83 c4 08 add $8,%esp */
2673     *a++ = 0x83; *a++ = 0xc4; *a++ = 8;
2674     break;
2675     }
2676    
2677     /* Load PC from the cpu struct. */
2678     *a++ = 0x8b; *a++ = 0xbe; *a++ = ofs_pc&255;
2679     *a++ = (ofs_pc>>8)&255; *a++ = (ofs_pc>>16)&255;
2680     *a++ = (ofs_pc>>24)&255; /* mov pc(%esi),%edi */
2681    
2682     *addrp = a;
2683    
2684     switch (itype) {
2685     case CALL_ERET:
2686     case CALL_SYSCALL:
2687     case CALL_BREAK:
2688     break;
2689     default:
2690     bintrans_write_pc_inc(addrp);
2691     }
2692    
2693     return 1;
2694     }
2695    
2696    
2697     /*
2698     * bintrans_backend_init():
2699     *
2700     * This is neccessary for broken GCC 2.x. (For GCC 3.x, this wouldn't be
2701     * neccessary, and the old code would have worked.)
2702     */
2703     static void bintrans_backend_init(void)
2704     {
2705     int size;
2706     unsigned char *p;
2707    
2708    
2709     /* "runchunk": */
2710     size = 64; /* NOTE: This MUST be enough, or we fail */
2711     p = (unsigned char *)mmap(NULL, size, PROT_READ | PROT_WRITE |
2712     PROT_EXEC, MAP_ANON | MAP_PRIVATE, -1, 0);
2713    
2714     /* If mmap() failed, try malloc(): */
2715     if (p == NULL) {
2716     p = malloc(size);
2717     if (p == NULL) {
2718     fprintf(stderr, "bintrans_backend_init():"
2719     " out of memory\n");
2720     exit(1);
2721     }
2722     }
2723    
2724     bintrans_runchunk = (void *)p;
2725    
2726     *p++ = 0x57; /* push %edi */
2727     *p++ = 0x56; /* push %esi */
2728     *p++ = 0x55; /* push %ebp */
2729     *p++ = 0x53; /* push %ebx */
2730    
2731     /*
2732     * In all translated code, esi points to the cpu struct, and
2733     * ebp is the nr of executed (translated) instructions.
2734     */
2735    
2736     /* 0=ebx, 4=ebp, 8=esi, 0xc=edi, 0x10=retaddr, 0x14=arg0, 0x18=arg1 */
2737    
2738     /* mov 0x8(%esp,1),%esi */
2739     *p++ = 0x8b; *p++ = 0x74; *p++ = 0x24; *p++ = 0x14;
2740    
2741     /* mov nr_instr(%esi),%ebp */
2742     *p++ = 0x8b; *p++ = 0xae; *p++ = ofs_i&255; *p++ = (ofs_i>>8)&255;
2743     *p++ = (ofs_i>>16)&255; *p++ = (ofs_i>>24)&255;
2744    
2745     /* mov pc(%esi),%edi */
2746     *p++ = 0x8b; *p++ = 0xbe; *p++ = ofs_pc&255; *p++ = (ofs_pc>>8)&255;
2747     *p++ = (ofs_pc>>16)&255; *p++ = (ofs_pc>>24)&255;
2748    
2749     /* call *0x18(%esp,1) */
2750     *p++ = 0xff; *p++ = 0x54; *p++ = 0x24; *p++ = 0x18;
2751    
2752     /* mov %ebp,0x1234(%esi) */
2753     *p++ = 0x89; *p++ = 0xae; *p++ = ofs_i&255; *p++ = (ofs_i>>8)&255;
2754     *p++ = (ofs_i>>16)&255; *p++ = (ofs_i>>24)&255;
2755    
2756     /* mov %edi,pc(%esi) */
2757     *p++ = 0x89; *p++ = 0xbe; *p++ = ofs_pc&255; *p++ = (ofs_pc>>8)&255;
2758     *p++ = (ofs_pc>>16)&255; *p++ = (ofs_pc>>24)&255;
2759    
2760     *p++ = 0x5b; /* pop %ebx */
2761     *p++ = 0x5d; /* pop %ebp */
2762     *p++ = 0x5e; /* pop %esi */
2763     *p++ = 0x5f; /* pop %edi */
2764     *p++ = 0xc3; /* ret */
2765    
2766    
2767    
2768     /* "jump_to_32bit_pc": */
2769     size = 128; /* NOTE: This MUST be enough, or we fail */
2770     p = (unsigned char *)mmap(NULL, size, PROT_READ | PROT_WRITE |
2771     PROT_EXEC, MAP_ANON | MAP_PRIVATE, -1, 0);
2772    
2773     /* If mmap() failed, try malloc(): */
2774     if (p == NULL) {
2775     p = malloc(size);
2776     if (p == NULL) {
2777     fprintf(stderr, "bintrans_backend_init():"
2778     " out of memory\n");
2779     exit(1);
2780     }
2781     }
2782    
2783     bintrans_jump_to_32bit_pc = (void *)p;
2784    
2785     /* Don't execute too many instructions. */
2786     /* 81 fd f0 1f 00 00 cmpl $0x1ff0,%ebp */
2787     /* 7c 01 jl <okk> */
2788     /* c3 ret */
2789     *p++ = 0x81; *p++ = 0xfd; *p++ = (N_SAFE_BINTRANS_LIMIT-1) & 255;
2790     *p++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8) & 255; *p++ = 0; *p++ = 0;
2791     *p++ = 0x7c; *p++ = 0x01;
2792     *p++ = 0xc3;
2793    
2794     /*
2795     * ebx = ((vaddr >> 22) & 1023) * sizeof(void *)
2796     *
2797     * 89 c3 mov %eax,%ebx
2798     * c1 eb 14 shr $20,%ebx
2799     * 81 e3 fc 0f 00 00 and $0xffc,%ebx
2800     */
2801     *p++ = 0x89; *p++ = 0xc3;
2802     *p++ = 0xc1; *p++ = 0xeb; *p++ = 0x14;
2803     *p++ = 0x81; *p++ = 0xe3; *p++ = 0xfc; *p++ = 0x0f; *p++ = 0; *p++ = 0;
2804    
2805     /*
2806     * ecx = vaddr_to_hostaddr_table0
2807     *
2808     * 8b 8e 34 12 00 00 mov 0x1234(%esi),%ecx
2809     */
2810     *p++ = 0x8b; *p++ = 0x8e;
2811     *p++ = ofs_tabl0 & 255; *p++ = (ofs_tabl0 >> 8) & 255;
2812     *p++ = (ofs_tabl0 >> 16) & 255; *p++ = (ofs_tabl0 >> 24) & 255;
2813    
2814     /*
2815     * ecx = vaddr_to_hostaddr_table0[a]
2816     *
2817     * 8b 0c 19 mov (%ecx,%ebx),%ecx
2818     */
2819     *p++ = 0x8b; *p++ = 0x0c; *p++ = 0x19;
2820    
2821     /*
2822     * ebx = ((vaddr >> 12) & 1023) * sizeof(void *)
2823     *
2824     * 89 c3 mov %eax,%ebx
2825     * c1 eb 0a shr $10,%ebx
2826     * 81 e3 fc 0f 00 00 and $0xffc,%ebx
2827     */
2828     *p++ = 0x89; *p++ = 0xc3;
2829     *p++ = 0xc1; *p++ = 0xeb; *p++ = 0x0a;
2830     *p++ = 0x81; *p++ = 0xe3; *p++ = 0xfc; *p++ = 0x0f; *p++ = 0; *p++ = 0;
2831    
2832     /*
2833     * ecx = vaddr_to_hostaddr_table0[a][b].cd.mips.chunks
2834     *
2835     * 8b 8c 19 56 34 12 00 mov 0x123456(%ecx,%ebx,1),%ecx
2836     */
2837     *p++ = 0x8b; *p++ = 0x8c; *p++ = 0x19; *p++ = ofs_chunks & 255;
2838     *p++ = (ofs_chunks >> 8) & 255; *p++ = (ofs_chunks >> 16) & 255;
2839     *p++ = (ofs_chunks >> 24) & 255;
2840    
2841     /*
2842     * ecx = NULL? Then return with failure.
2843     *
2844     * 83 f9 00 cmp $0x0,%ecx
2845     * 75 01 jne <okzzz>
2846     */
2847     *p++ = 0x83; *p++ = 0xf9; *p++ = 0x00;
2848     *p++ = 0x75; *p++ = 0x01;
2849     *p++ = 0xc3; /* TODO: failure? */
2850    
2851     /*
2852     * 25 fc 0f 00 00 and $0xffc,%eax
2853     * 01 c1 add %eax,%ecx
2854     *
2855     * 8b 01 mov (%ecx),%eax
2856     *
2857     * 83 f8 00 cmp $0x0,%eax
2858     * 75 01 jne <ok>
2859     * c3 ret
2860     */
2861     *p++ = 0x25; *p++ = 0xfc; *p++ = 0x0f; *p++ = 0; *p++ = 0;
2862     *p++ = 0x01; *p++ = 0xc1;
2863    
2864     *p++ = 0x8b; *p++ = 0x01;
2865    
2866     *p++ = 0x83; *p++ = 0xf8; *p++ = 0x00;
2867     *p++ = 0x75; *p++ = 0x01;
2868     *p++ = 0xc3; /* TODO: failure? */
2869    
2870     /* 03 86 78 56 34 12 add 0x12345678(%esi),%eax */
2871     /* ff e0 jmp *%eax */
2872     *p++ = 0x03; *p++ = 0x86; *p++ = ofs_chunkbase & 255;
2873     *p++ = (ofs_chunkbase >> 8) & 255; *p++ = (ofs_chunkbase >> 16) & 255;
2874     *p++ = (ofs_chunkbase >> 24) & 255;
2875     *p++ = 0xff; *p++ = 0xe0;
2876    
2877    
2878    
2879 dpavlin 10 /* "load_32bit": */
2880 dpavlin 2 size = 48; /* NOTE: This MUST be enough, or we fail */
2881     p = (unsigned char *)mmap(NULL, size, PROT_READ | PROT_WRITE |
2882     PROT_EXEC, MAP_ANON | MAP_PRIVATE, -1, 0);
2883    
2884     /* If mmap() failed, try malloc(): */
2885     if (p == NULL) {
2886     p = malloc(size);
2887     if (p == NULL) {
2888     fprintf(stderr, "bintrans_backend_init():"
2889     " out of memory\n");
2890     exit(1);
2891     }
2892     }
2893    
2894 dpavlin 10 bintrans_load_32bit = (void *)p;
2895 dpavlin 2
2896 dpavlin 12 /* ebx = vaddr >> 12; */
2897     *p++ = 0x89; *p++ = 0xc3; /* mov %eax, %ebx */
2898     *p++ = 0xc1; *p++ = 0xeb; *p++ = 0x0c; /* shr $12, %ebx */
2899 dpavlin 2
2900 dpavlin 12 /* ecx = cpu->cd.mips.host_load */
2901     *p++ = 0x8b; *p++ = 0x8e; *p++ = ofs_h_l & 255;
2902     *p++ = (ofs_h_l >> 8) & 255;
2903     *p++ = (ofs_h_l >> 16) & 255; *p++ = (ofs_h_l >> 24) & 255;
2904 dpavlin 2
2905 dpavlin 12 /* ecx = host_load[a] */
2906     *p++ = 0x8b; *p++ = 0x0c; *p++ = 0x99; /* mov (%ecx,%ebx,4),%ecx */
2907 dpavlin 2
2908 dpavlin 10 /* ret */
2909     *p++ = 0xc3;
2910    
2911    
2912    
2913     /* "store_32bit": */
2914     size = 48; /* NOTE: This MUST be enough, or we fail */
2915     p = (unsigned char *)mmap(NULL, size, PROT_READ | PROT_WRITE |
2916     PROT_EXEC, MAP_ANON | MAP_PRIVATE, -1, 0);
2917    
2918     /* If mmap() failed, try malloc(): */
2919     if (p == NULL) {
2920     p = malloc(size);
2921     if (p == NULL) {
2922     fprintf(stderr, "bintrans_backend_init():"
2923     " out of memory\n");
2924     exit(1);
2925     }
2926     }
2927    
2928     bintrans_store_32bit = (void *)p;
2929    
2930 dpavlin 12 /* ebx = vaddr >> 12; */
2931     *p++ = 0x89; *p++ = 0xc3; /* mov %eax, %ebx */
2932     *p++ = 0xc1; *p++ = 0xeb; *p++ = 0x0c; /* shr $12, %ebx */
2933 dpavlin 10
2934 dpavlin 12 /* ecx = cpu->cd.mips.host_store */
2935     *p++ = 0x8b; *p++ = 0x8e; *p++ = ofs_h_s & 255;
2936     *p++ = (ofs_h_s >> 8) & 255;
2937     *p++ = (ofs_h_s >> 16) & 255; *p++ = (ofs_h_s >> 24) & 255;
2938 dpavlin 10
2939 dpavlin 12 /* ecx = host_store[a] */
2940     *p++ = 0x8b; *p++ = 0x0c; *p++ = 0x99; /* mov (%ecx,%ebx,4),%ecx */
2941 dpavlin 2
2942     /* ret */
2943     *p++ = 0xc3;
2944     }
2945    

  ViewVC Help
Powered by ViewVC 1.1.26