/[gxemul]/trunk/src/bintrans_alpha.c
This is repository of my old source code which isn't updated any more. Go to git.rot13.org for current projects!
ViewVC logotype

Annotation of /trunk/src/bintrans_alpha.c

Parent Directory Parent Directory | Revision Log Revision Log


Revision 10 - (hide annotations)
Mon Oct 8 16:18:27 2007 UTC (16 years, 6 months ago) by dpavlin
File MIME type: text/plain
File size: 84728 byte(s)
++ trunk/HISTORY	(local)
$Id: HISTORY,v 1.815 2005/06/27 23:04:35 debug Exp $
20050617	Experimenting some more with netbooting OpenBSD/sgi. Adding
		a hack which allows emulated ethernet networks to be
		distributed across multiple emulator processes.
20050618	Minor updates (documentation, dummy YAMON emulation, etc).
20050620	strcpy/strcat -> strlcpy/strlcat updates.
		Some more progress on evbmips (Malta).
20050621	Adding a section to doc/configfiles.html about ethernet
		emulation across multiple hosts.
		Beginning the work on the ARM translation engine (using the
		dynamic-but-not-binary translation method).
		Fixing a bintrans bug: 0x9fc00000 should always be treated as
		PROM area, just as 0xbfc00000 is.
		Minor progress on Malta emulation (the PCI-ISA bus).
20050622	NetBSD/evbmips can now be installed (using another emulated
		machine) and run (including userland and so on). :-)
		Spliting up the bintrans haddr_entry field into two (one for
		read, one for write). Probably not much of a speed increase,
		though.
		Updating some NetBSD 2.0 -> 2.0.2 in the documentation.
20050623	Minor updates (documentation, the TODO file, etc).
		gzipped kernels are now always automagically gunzipped when
		loaded.
20050624	Adding a dummy Playstation Portable (PSP) mode, just barely
		enough to run Hello World (in weird colors :-).
		Removing the -b command line option; old bintrans is enabled
		by default instead. It makes more sense.
		Trying to finally fix the non-working performance measurement
		thing (instr/second etc).
20050625	Continuing on the essential basics for ARM emulation. Two
		instructions seem to work, a branch and a simple "mov". (The
		mov arguments are not correct yet.) Performance is definitely
		reasonable.
		Various other minor updates.
		Adding the ARM "bl" instruction.
		Adding support for combining multiple ARM instructions into one
		function call. ("mov" + "mov" is the only one implemented so
		far, but it seems to work.)
		Cleaning up some IP32 interrupt things (crime/mace); disabling
		the PS/2 keyboard controller on IP32, so that NetBSD/sgimips
		boots into userland again.
20050626	Finally! NetBSD/sgimips netboots. Adding instructions to
		doc/guestoses.html on how to set up an nfs server etc.
		Various other minor fixes.
		Playstation Portable ".pbp" files can now be used directly.
		(The ELF part of the .pbp is extracted transparently.)
		Converting some sprintf -> snprintf.
		Adding some more instructions to the ARM disassembler.
20050627	More ARM updates. Adding some simple ldr(b), str(b),
		cmps, and conditional branch instructions, enough to run
		a simple Hello World program.
		All ARM instructions are now inlined/generated for all possible
		condition codes.
		Adding add and sub, and more load/store instructions.
		Removing dummy files: cpu_alpha.c, cpu_hppa.c, and cpu_sparc.c.
		Some minor documentation updates; preparing for a 0.3.4
		release. Updating some URLs.

==============  RELEASE 0.3.4  ==============


1 dpavlin 2 /*
2     * Copyright (C) 2004-2005 Anders Gavare. All rights reserved.
3     *
4     * Redistribution and use in source and binary forms, with or without
5     * modification, are permitted provided that the following conditions are met:
6     *
7     * 1. Redistributions of source code must retain the above copyright
8     * notice, this list of conditions and the following disclaimer.
9     * 2. Redistributions in binary form must reproduce the above copyright
10     * notice, this list of conditions and the following disclaimer in the
11     * documentation and/or other materials provided with the distribution.
12     * 3. The name of the author may not be used to endorse or promote products
13     * derived from this software without specific prior written permission.
14     *
15     * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
16     * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
17     * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
18     * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
19     * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
20     * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
21     * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
22     * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
23     * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
24     * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
25     * SUCH DAMAGE.
26     *
27     *
28 dpavlin 10 * $Id: bintrans_alpha.c,v 1.119 2005/06/22 10:12:25 debug Exp $
29 dpavlin 2 *
30     * Alpha specific code for dynamic binary translation.
31     *
32     * See bintrans.c for more information. Included from bintrans.c.
33     *
34     *
35     * Some Alpha registers that are reasonable to use:
36     *
37     * t5..t7 6..8 3
38     * s0..s6 9..15 7
39     * a1..a5 17..21 5
40     * t8..t11 22..25 4
41     *
42 dpavlin 4 * These can be "mapped" to MIPS registers in the translated code, except a0
43     * which points to the cpu struct, and t0..t4 (or so) which are used by the
44     * translated code as temporaries.
45 dpavlin 2 *
46 dpavlin 4 * 3 + 7 + 5 + 4 = 19 available registers. Of course, all (except s0..s6) must
47     * be saved when calling external functions, such as when calling tlbp and
48     * other external functions.
49 dpavlin 2 *
50 dpavlin 4 * Which are the 19 most commonly used MIPS registers? (This will include the
51     * pc, and the "current number of executed translated instructions.)
52 dpavlin 2 *
53     * The current allocation is as follows:
54     *
55     * Alpha: MIPS:
56     * ------ -----
57     *
58     * t5 pc (64-bit)
59     * t6 bintrans_instructions_executed (32-bit int)
60     * t7 a0 (mips register 4) (64-bit)
61     * t8 a1 (mips register 5) (64-bit)
62     * t9 s0 (mips register 16) (64-bit)
63     * t10 table0 cached (for load/store)
64     * t11 v0 (mips register 2) (64-bit)
65     * s0 delay_slot (32-bit int)
66     * s1 delay_jmpaddr (64-bit)
67     * s2 sp (mips register 29) (64-bit)
68     * s3 ra (mips register 31) (64-bit)
69     * s4 t0 (mips register 8) (64-bit)
70     * s5 t1 (mips register 9) (64-bit)
71     * s6 t2 (mips register 10) (64-bit)
72     */
73    
74     #define MIPSREG_PC -3
75     #define MIPSREG_DELAY_SLOT -2
76     #define MIPSREG_DELAY_JMPADDR -1
77    
78     #define ALPHA_T0 1
79     #define ALPHA_T1 2
80     #define ALPHA_T2 3
81     #define ALPHA_T3 4
82     #define ALPHA_T4 5
83     #define ALPHA_T5 6
84     #define ALPHA_T6 7
85     #define ALPHA_T7 8
86     #define ALPHA_S0 9
87     #define ALPHA_S1 10
88     #define ALPHA_S2 11
89     #define ALPHA_S3 12
90     #define ALPHA_S4 13
91     #define ALPHA_S5 14
92     #define ALPHA_S6 15
93     #define ALPHA_A0 16
94     #define ALPHA_A1 17
95     #define ALPHA_A2 18
96     #define ALPHA_A3 19
97     #define ALPHA_A4 20
98     #define ALPHA_A5 21
99     #define ALPHA_T8 22
100     #define ALPHA_T9 23
101     #define ALPHA_T10 24
102     #define ALPHA_T11 25
103     #define ALPHA_ZERO 31
104    
105     static int map_MIPS_to_Alpha[32] = {
106     ALPHA_ZERO, -1, ALPHA_T11, -1, /* 0 .. 3 */
107     ALPHA_T7, ALPHA_T8, -1, -1, /* 4 .. 7 */
108     ALPHA_S4, ALPHA_S5, ALPHA_S6, -1, /* 8 .. 11 */
109     -1, -1, -1, -1, /* 12 .. 15 */
110     ALPHA_T9, -1, -1, -1, /* 16 .. 19 */
111     -1, -1, -1, -1, /* 20 .. 23 */
112     -1, -1, -1, -1, /* 24 .. 27 */
113     -1, ALPHA_S2, -1, ALPHA_S3, /* 28 .. 31 */
114     };
115    
116    
117     struct cpu dummy_cpu;
118     struct mips_coproc dummy_coproc;
119     struct vth32_table dummy_vth32_table;
120    
121     unsigned char bintrans_alpha_imb[32] = {
122     0x86, 0x00, 0x00, 0x00, /* imb */
123     0x01, 0x80, 0xfa, 0x6b, /* ret */
124     0x1f, 0x04, 0xff, 0x47, /* nop */
125     0x00, 0x00, 0xfe, 0x2e, /* unop */
126     0x1f, 0x04, 0xff, 0x47, /* nop */
127     0x00, 0x00, 0xfe, 0x2e, /* unop */
128     0x1f, 0x04, 0xff, 0x47, /* nop */
129     0x00, 0x00, 0xfe, 0x2e /* unop */
130     };
131    
132    
133     /*
134     * bintrans_host_cacheinvalidate()
135     *
136     * Invalidate the host's instruction cache. On Alpha, we do this by
137     * executing an imb instruction.
138     *
139     * NOTE: A simple asm("imb"); would be enough here, but not all
140     * compilers have such simple constructs, so an entire function has to
141     * be written as bintrans_alpha_imb[] above.
142     */
143     static void bintrans_host_cacheinvalidate(unsigned char *p, size_t len)
144     {
145     /* Long form of ``asm("imb");'' */
146    
147     void (*f)(void);
148     f = (void *)&bintrans_alpha_imb[0];
149     f();
150     }
151    
152    
153     /*
154     * lda sp,-128(sp) some margin
155     * stq ra,0(sp)
156     * stq s0,8(sp)
157     * stq s1,16(sp)
158     * stq s2,24(sp)
159     * stq s3,32(sp)
160     * stq s4,40(sp)
161     * stq s5,48(sp)
162     * stq s6,56(sp)
163     *
164     * jsr ra,(a1),<back>
165     * back:
166     *
167     * ldq ra,0(sp)
168     * ldq s0,8(sp)
169     * ldq s1,16(sp)
170     * ldq s2,24(sp)
171     * ldq s3,32(sp)
172     * ldq s4,40(sp)
173     * ldq s5,48(sp)
174     * ldq s6,56(sp)
175     * lda sp,128(sp)
176     * ret
177     */
178     /* note: offsetof (in stdarg.h) could possibly be used, but I'm not sure
179     if it will take care of the compiler problems... */
180     #define ofs_pc (((size_t)&dummy_cpu.pc) - ((size_t)&dummy_cpu))
181     #define ofs_pc_last (((size_t)&dummy_cpu.cd.mips.pc_last) - ((size_t)&dummy_cpu))
182     #define ofs_n (((size_t)&dummy_cpu.cd.mips.bintrans_instructions_executed) - ((size_t)&dummy_cpu))
183     #define ofs_ds (((size_t)&dummy_cpu.cd.mips.delay_slot) - ((size_t)&dummy_cpu))
184     #define ofs_ja (((size_t)&dummy_cpu.cd.mips.delay_jmpaddr) - ((size_t)&dummy_cpu))
185     #define ofs_sp (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_SP]) - ((size_t)&dummy_cpu))
186     #define ofs_ra (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_RA]) - ((size_t)&dummy_cpu))
187     #define ofs_a0 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_A0]) - ((size_t)&dummy_cpu))
188     #define ofs_a1 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_A1]) - ((size_t)&dummy_cpu))
189     #define ofs_t0 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_T0]) - ((size_t)&dummy_cpu))
190     #define ofs_t1 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_T1]) - ((size_t)&dummy_cpu))
191     #define ofs_t2 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_T2]) - ((size_t)&dummy_cpu))
192     #define ofs_v0 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_V0]) - ((size_t)&dummy_cpu))
193     #define ofs_s0 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_S0]) - ((size_t)&dummy_cpu))
194     #define ofs_tbl0 (((size_t)&dummy_cpu.cd.mips.vaddr_to_hostaddr_table0) - ((size_t)&dummy_cpu))
195     #define ofs_c0 ((size_t)&dummy_vth32_table.bintrans_chunks[0] - (size_t)&dummy_vth32_table)
196     #define ofs_cb (((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu)
197    
198    
199 dpavlin 10 static uint32_t bintrans_alpha_load_32bit[18] = {
200 dpavlin 2 /*
201     * t1 = 1023;
202     * t2 = ((a1 >> 22) & t1) * sizeof(void *);
203     * t3 = ((a1 >> 12) & t1) * sizeof(void *);
204     * t1 = a1 & 4095;
205     *
206     * f8 1f 5f 20 lda t1,1023 * 8
207     * 83 76 22 4a srl a1,19,t2
208     * 84 36 21 4a srl a1, 9,t3
209     * 03 00 62 44 and t2,t1,t2
210     */
211     0x205f1ff8,
212     0x4a227683,
213     0x4a213684,
214     0x44620003,
215    
216     /*
217     * t10 is vaddr_to_hostaddr_table0
218     *
219     * a3 = tbl0[t2] (load entry from tbl0)
220     * 12 04 03 43 addq t10,t2,a2
221     */
222     0x43030412,
223    
224     /* 04 00 82 44 and t3,t1,t3 */
225     0x44820004,
226    
227     /* 00 00 72 a6 ldq a3,0(a2) */
228     0xa6720000,
229    
230 dpavlin 10 /* 24 37 80 48 sll t3,0x1,t3 */
231     0x48803724,
232    
233 dpavlin 2 /* ff 0f 5f 20 lda t1,4095 */
234     0x205f0fff,
235    
236     /*
237     * a3 = tbl1[t3] (load entry from tbl1 (which is a3))
238     * 13 04 64 42 addq a3,t3,a3
239     */
240     0x42640413,
241    
242     /* 02 00 22 46 and a1,t1,t1 */
243     0x46220002,
244    
245     /* 00 00 73 a6 ldq a3,0(a3) */
246     0xa6730000,
247    
248     /* NULL? Then return failure at once. */
249 dpavlin 4 /* beq a3, return */
250 dpavlin 10 0xe6600002,
251 dpavlin 2
252 dpavlin 10 /* The rest of the load/store code was written with t3 as the address. */
253 dpavlin 2
254 dpavlin 10 /* Add the offset within the page: */
255     /* 04 04 62 42 addq a3,t1,t3 */
256     0x42620404,
257 dpavlin 2
258 dpavlin 10 0x6be50000, /* jmp (t4) */
259    
260     /* return: */
261     0x243f0000 | (BINTRANS_DONT_RUN_NEXT >> 16), /* ldah t0,256 */
262     0x44270407, /* or t0,t6,t6 */
263     0x6bfa8001 /* ret */
264     };
265    
266     static uint32_t bintrans_alpha_store_32bit[18] = {
267     /*
268     * t1 = 1023;
269     * t2 = ((a1 >> 22) & t1) * sizeof(void *);
270     * t3 = ((a1 >> 12) & t1) * sizeof(void *);
271     * t1 = a1 & 4095;
272     *
273     * f8 1f 5f 20 lda t1,1023 * 8
274     * 83 76 22 4a srl a1,19,t2
275     * 84 36 21 4a srl a1, 9,t3
276     * 03 00 62 44 and t2,t1,t2
277     */
278     0x205f1ff8,
279     0x4a227683,
280     0x4a213684,
281     0x44620003,
282    
283     /*
284     * t10 is vaddr_to_hostaddr_table0
285     *
286     * a3 = tbl0[t2] (load entry from tbl0)
287     * 12 04 03 43 addq t10,t2,a2
288     */
289     0x43030412,
290    
291     /* 04 00 82 44 and t3,t1,t3 */
292     0x44820004,
293    
294     /* 00 00 72 a6 ldq a3,0(a2) */
295     0xa6720000,
296    
297     /* 24 37 80 48 sll t3,0x1,t3 */
298     0x48803724,
299    
300     /* ff 0f 5f 20 lda t1,4095 */
301     0x205f0fff,
302    
303     /*
304     * a3 = tbl1[t3] (load entry from tbl1 (which is a3))
305     * 13 04 64 42 addq a3,t3,a3
306     */
307     0x42640413,
308    
309     /* 02 00 22 46 and a1,t1,t1 */
310     0x46220002,
311    
312     /* 00 00 73 a6 ldq a3,8(a3) */
313     0xa6730008,
314    
315     /* NULL? Then return failure at once. */
316     /* beq a3, return */
317     0xe6600002,
318    
319 dpavlin 2 /* The rest of the load/store code was written with t3 as the address. */
320    
321     /* Add the offset within the page: */
322     /* 04 04 62 42 addq a3,t1,t3 */
323     0x42620404,
324    
325 dpavlin 4 0x6be50000, /* jmp (t4) */
326    
327     /* return: */
328     0x243f0000 | (BINTRANS_DONT_RUN_NEXT >> 16), /* ldah t0,256 */
329     0x44270407, /* or t0,t6,t6 */
330     0x6bfa8001 /* ret */
331 dpavlin 2 };
332    
333     static void (*bintrans_runchunk)(struct cpu *, unsigned char *);
334    
335     static void (*bintrans_jump_to_32bit_pc)(struct cpu *);
336    
337 dpavlin 10 static void (*bintrans_load_32bit)
338     (struct cpu *) = (void *)bintrans_alpha_load_32bit;
339 dpavlin 2
340 dpavlin 10 static void (*bintrans_store_32bit)
341     (struct cpu *) = (void *)bintrans_alpha_store_32bit;
342 dpavlin 2
343 dpavlin 10
344 dpavlin 2 /*
345     * bintrans_write_quickjump():
346     */
347     static void bintrans_write_quickjump(struct memory *mem,
348     unsigned char *quickjump_code, uint32_t chunkoffset)
349     {
350     int ofs;
351     uint64_t alpha_addr = chunkoffset +
352     (size_t)mem->translation_code_chunk_space;
353     uint32_t *a = (uint32_t *)quickjump_code;
354    
355     ofs = (alpha_addr - ((size_t)a+4)) / 4;
356    
357     /* printf("chunkoffset=%i, %016llx %016llx %i\n",
358     chunkoffset, (long long)alpha_addr, (long long)a, ofs); */
359    
360     if (ofs > -0xfffff && ofs < 0xfffff) {
361     *a++ = 0xc3e00000 | (ofs & 0x1fffff); /* br <chunk> */
362     }
363     }
364    
365    
366     /*
367     * bintrans_write_chunkreturn():
368     */
369     static void bintrans_write_chunkreturn(unsigned char **addrp)
370     {
371     uint32_t *a = (uint32_t *) *addrp;
372     *a++ = 0x6bfa8001; /* ret */
373     *addrp = (unsigned char *) a;
374     }
375    
376    
377     /*
378     * bintrans_write_chunkreturn_fail():
379     */
380     static void bintrans_write_chunkreturn_fail(unsigned char **addrp)
381     {
382     uint32_t *a = (uint32_t *) *addrp;
383     /* 00 01 3f 24 ldah t0,256 */
384     /* 07 04 27 44 or t0,t6,t6 */
385     *a++ = 0x243f0000 | (BINTRANS_DONT_RUN_NEXT >> 16);
386     *a++ = 0x44270407;
387     *a++ = 0x6bfa8001; /* ret */
388     *addrp = (unsigned char *) a;
389     }
390    
391    
392     /*
393     * bintrans_move_MIPS_reg_into_Alpha_reg():
394     */
395     static void bintrans_move_MIPS_reg_into_Alpha_reg(unsigned char **addrp, int mipsreg, int alphareg)
396     {
397     uint32_t *a = (uint32_t *) *addrp;
398     int ofs, alpha_mips_reg;
399    
400     switch (mipsreg) {
401     case MIPSREG_PC:
402     /* addq t5,0,alphareg */
403     *a++ = 0x40c01400 | alphareg;
404     break;
405     case MIPSREG_DELAY_SLOT:
406     /* addq s0,0,alphareg */
407     *a++ = 0x41201400 | alphareg;
408     break;
409     case MIPSREG_DELAY_JMPADDR:
410     /* addq s1,0,alphareg */
411     *a++ = 0x41401400 | alphareg;
412     break;
413     default:
414     alpha_mips_reg = map_MIPS_to_Alpha[mipsreg];
415     if (alpha_mips_reg < 0) {
416     ofs = ((size_t)&dummy_cpu.cd.mips.gpr[mipsreg]) - (size_t)&dummy_cpu;
417     /* ldq alphareg,gpr[mipsreg](a0) */
418     *a++ = 0xa4100000 | (alphareg << 21) | ofs;
419     } else {
420     /* addq alpha_mips_reg,0,alphareg */
421     *a++ = 0x40001400 | (alpha_mips_reg << 21) | alphareg;
422     }
423     }
424     *addrp = (unsigned char *) a;
425     }
426    
427    
428     /*
429     * bintrans_move_Alpha_reg_into_MIPS_reg():
430     */
431     static void bintrans_move_Alpha_reg_into_MIPS_reg(unsigned char **addrp, int alphareg, int mipsreg)
432     {
433     uint32_t *a = (uint32_t *) *addrp;
434     int ofs, alpha_mips_reg;
435    
436     switch (mipsreg) {
437     case MIPSREG_PC:
438     /* addq alphareg,0,t5 */
439     *a++ = 0x40001406 | (alphareg << 21);
440     break;
441     case MIPSREG_DELAY_SLOT:
442     /* addq alphareg,0,s0 */
443     *a++ = 0x40001409 | (alphareg << 21);
444     break;
445     case MIPSREG_DELAY_JMPADDR:
446     /* addq alphareg,0,s1 */
447     *a++ = 0x4000140a | (alphareg << 21);
448     break;
449     case 0: /* the zero register */
450     break;
451     default:
452     alpha_mips_reg = map_MIPS_to_Alpha[mipsreg];
453     if (alpha_mips_reg < 0) {
454     /* stq alphareg,gpr[mipsreg](a0) */
455     ofs = ((size_t)&dummy_cpu.cd.mips.gpr[mipsreg]) - (size_t)&dummy_cpu;
456     *a++ = 0xb4100000 | (alphareg << 21) | ofs;
457     } else {
458     /* addq alphareg,0,alpha_mips_reg */
459     *a++ = 0x40001400 | (alphareg << 21) | alpha_mips_reg;
460     }
461     }
462     *addrp = (unsigned char *) a;
463     }
464    
465    
466     /*
467     * bintrans_write_pc_inc():
468     */
469     static void bintrans_write_pc_inc(unsigned char **addrp)
470     {
471     uint32_t *a = (uint32_t *) *addrp;
472    
473     /* lda t6,1(t6) */
474     *a++ = 0x20e70001;
475    
476     /* lda t5,4(t5) */
477     *a++ = 0x20c60004;
478    
479     *addrp = (unsigned char *) a;
480     }
481    
482    
483     /*
484     * bintrans_write_instruction__addiu_etc():
485     */
486     static int bintrans_write_instruction__addiu_etc(unsigned char **addrp,
487     int rt, int rs, int imm, int instruction_type)
488     {
489     uint32_t *a;
490     unsigned int uimm;
491     int alpha_rs, alpha_rt;
492    
493     /* TODO: overflow detection for ADDI and DADDI */
494     switch (instruction_type) {
495     case HI6_ADDI:
496     case HI6_DADDI:
497     return 0;
498     }
499    
500     a = (uint32_t *) *addrp;
501    
502     if (rt == 0)
503     goto rt0;
504    
505     uimm = imm & 0xffff;
506    
507     alpha_rs = map_MIPS_to_Alpha[rs];
508     alpha_rt = map_MIPS_to_Alpha[rt];
509    
510     if (uimm == 0 && (instruction_type == HI6_ADDI ||
511     instruction_type == HI6_ADDIU || instruction_type == HI6_DADDI ||
512     instruction_type == HI6_DADDIU || instruction_type == HI6_ORI)) {
513     if (alpha_rs >= 0 && alpha_rt >= 0) {
514     /* addq rs,0,rt */
515     *a++ = 0x40001400 | (alpha_rs << 21) | alpha_rt;
516     } else {
517     *addrp = (unsigned char *) a;
518     bintrans_move_MIPS_reg_into_Alpha_reg(addrp, rs, ALPHA_T0);
519     bintrans_move_Alpha_reg_into_MIPS_reg(addrp, ALPHA_T0, rt);
520     a = (uint32_t *) *addrp;
521     }
522     goto rt0;
523     }
524    
525     if (alpha_rs < 0) {
526     /* ldq t0,"rs"(a0) */
527     *addrp = (unsigned char *) a;
528     bintrans_move_MIPS_reg_into_Alpha_reg(addrp, rs, ALPHA_T0);
529     a = (uint32_t *) *addrp;
530     alpha_rs = ALPHA_T0;
531     }
532    
533     if (alpha_rt < 0)
534     alpha_rt = ALPHA_T0;
535    
536     /* Place the result of the calculation in alpha_rt: */
537    
538     switch (instruction_type) {
539     case HI6_ADDIU:
540     case HI6_DADDIU:
541     case HI6_ADDI:
542     case HI6_DADDI:
543     if (uimm < 256) {
544     if (instruction_type == HI6_ADDI ||
545     instruction_type == HI6_ADDIU) {
546     /* addl rs,uimm,rt */
547     *a++ = 0x40001000 | (alpha_rs << 21)
548     | (uimm << 13) | alpha_rt;
549     } else {
550     /* addq rs,uimm,rt */
551     *a++ = 0x40001400 | (alpha_rs << 21)
552     | (uimm << 13) | alpha_rt;
553     }
554     } else {
555     /* lda rt,imm(rs) */
556     *a++ = 0x20000000 | (alpha_rt << 21) | (alpha_rs << 16) | uimm;
557     if (instruction_type == HI6_ADDI ||
558     instruction_type == HI6_ADDIU) {
559     /* sign extend, 32->64 bits: addl t0,zero,t0 */
560     *a++ = 0x40001000 | (alpha_rt << 21) | alpha_rt;
561     }
562     }
563     break;
564     case HI6_ANDI:
565     case HI6_ORI:
566     case HI6_XORI:
567     if (uimm >= 256) {
568     /* lda t1,4660 */
569     *a++ = 0x205f0000 | uimm;
570     if (uimm & 0x8000) {
571     /* 01 00 42 24 ldah t1,1(t1) <-- if negative only */
572     *a++ = 0x24420001;
573     }
574     }
575    
576     switch (instruction_type) {
577     case HI6_ANDI:
578     if (uimm < 256) {
579     /* and rs,uimm,rt */
580     *a++ = 0x44001000 | (alpha_rs << 21)
581     | (uimm << 13) | alpha_rt;
582     } else {
583     /* and rs,t1,rt */
584     *a++ = 0x44020000 | (alpha_rs << 21) | alpha_rt;
585     }
586     break;
587     case HI6_ORI:
588     if (uimm < 256) {
589     /* or rs,uimm,rt */
590     *a++ = 0x44001400 | (alpha_rs << 21)
591     | (uimm << 13) | alpha_rt;
592     } else {
593     /* or rs,t1,rt */
594     *a++ = 0x44020400 | (alpha_rs << 21) | alpha_rt;
595     }
596     break;
597     case HI6_XORI:
598     if (uimm < 256) {
599     /* xor rs,uimm,rt */
600     *a++ = 0x44001800 | (alpha_rs << 21)
601     | (uimm << 13) | alpha_rt;
602     } else {
603     /* xor rs,t1,rt */
604     *a++ = 0x44020800 | (alpha_rs << 21) | alpha_rt;
605     }
606     break;
607     }
608     break;
609     case HI6_SLTI:
610     case HI6_SLTIU:
611     /* lda t1,4660 */
612     *a++ = 0x205f0000 | uimm;
613    
614     switch (instruction_type) {
615     case HI6_SLTI:
616     /* cmplt rs,t1,rt */
617     *a++ = 0x400209a0 | (alpha_rs << 21) | alpha_rt;
618     break;
619     case HI6_SLTIU:
620     /* cmpult rs,t1,rt */
621     *a++ = 0x400203a0 | (alpha_rs << 21) | alpha_rt;
622     break;
623     }
624     break;
625     }
626    
627     if (alpha_rt == ALPHA_T0) {
628     *a++ = 0x5fff041f; /* fnop */
629     *addrp = (unsigned char *) a;
630     bintrans_move_Alpha_reg_into_MIPS_reg(addrp, ALPHA_T0, rt);
631     a = (uint32_t *) *addrp;
632     }
633    
634     rt0:
635     *addrp = (unsigned char *) a;
636     bintrans_write_pc_inc(addrp);
637     return 1;
638     }
639    
640    
641     /*
642     * bintrans_write_instruction__addu_etc():
643     */
644     static int bintrans_write_instruction__addu_etc(unsigned char **addrp,
645     int rd, int rs, int rt, int sa, int instruction_type)
646     {
647     unsigned char *a, *unmodified = NULL;
648     int load64 = 0, store = 1, ofs, alpha_rd = ALPHA_T0;
649    
650     alpha_rd = map_MIPS_to_Alpha[rd];
651     if (alpha_rd < 0)
652     alpha_rd = ALPHA_T0;
653    
654     switch (instruction_type) {
655     case SPECIAL_DIV:
656     case SPECIAL_DIVU:
657     return 0;
658     }
659    
660     switch (instruction_type) {
661     case SPECIAL_DADDU:
662     case SPECIAL_DSUBU:
663     case SPECIAL_OR:
664     case SPECIAL_AND:
665     case SPECIAL_NOR:
666     case SPECIAL_XOR:
667     case SPECIAL_DSLL:
668     case SPECIAL_DSRL:
669     case SPECIAL_DSRA:
670     case SPECIAL_DSLL32:
671     case SPECIAL_DSRL32:
672     case SPECIAL_DSRA32:
673     case SPECIAL_SLT:
674     case SPECIAL_SLTU:
675     case SPECIAL_MOVZ:
676     case SPECIAL_MOVN:
677     load64 = 1;
678     }
679    
680     switch (instruction_type) {
681     case SPECIAL_MULT:
682     case SPECIAL_MULTU:
683     if (rd != 0)
684     return 0;
685     store = 0;
686     break;
687     default:
688     if (rd == 0)
689     goto rd0;
690     }
691    
692     a = *addrp;
693    
694     if ((instruction_type == SPECIAL_ADDU || instruction_type == SPECIAL_DADDU
695     || instruction_type == SPECIAL_OR) && rt == 0) {
696     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
697     if (!load64) {
698     *a++ = 0x01; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,t0 */
699     }
700     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rd);
701     *addrp = a;
702     goto rd0;
703     }
704    
705     /* t0 = rs, t1 = rt */
706     if (load64) {
707     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
708     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T1);
709     } else {
710     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
711     *a++ = 0x01; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,t0 */
712     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T1);
713     *a++ = 0x02; *a++ = 0x10; *a++ = 0x40; *a++ = 0x40; /* addl t1,0,t1 */
714     }
715    
716     switch (instruction_type) {
717     case SPECIAL_ADDU:
718     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x22; *a++ = 0x40; /* addl t0,t1,rd */
719     break;
720     case SPECIAL_DADDU:
721     *a++ = alpha_rd; *a++ = 0x04; *a++ = 0x22; *a++ = 0x40; /* addq t0,t1,rd */
722     break;
723     case SPECIAL_SUBU:
724     *a++ = 0x20 + alpha_rd; *a++ = 0x01; *a++ = 0x22; *a++ = 0x40; /* subl t0,t1,t0 */
725     break;
726     case SPECIAL_DSUBU:
727     *a++ = 0x20 + alpha_rd; *a++ = 0x05; *a++ = 0x22; *a++ = 0x40; /* subq t0,t1,t0 */
728     break;
729     case SPECIAL_AND:
730     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x22; *a++ = 0x44; /* and t0,t1,t0 */
731     break;
732     case SPECIAL_OR:
733     *a++ = alpha_rd; *a++ = 0x04; *a++ = 0x22; *a++ = 0x44; /* or t0,t1,t0 */
734     break;
735     case SPECIAL_NOR:
736     *a++ = 0x01; *a++ = 0x04; *a++ = 0x22; *a++ = 0x44; /* or t0,t1,t0 */
737     *a++ = alpha_rd; *a++ = 0x05; *a++ = 0xe1; *a++ = 0x47; /* not t0,t0 */
738     break;
739     case SPECIAL_XOR:
740     *a++ = alpha_rd; *a++ = 0x08; *a++ = 0x22; *a++ = 0x44; /* xor t0,t1,t0 */
741     break;
742     case SPECIAL_SLL:
743     *a++ = 0x21; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sll t1,sa,t0 */
744     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,t0 */
745     break;
746     case SPECIAL_SLLV:
747     /* rd = rt << (rs&31) (logical) t0 = t1 << (t0&31) */
748     *a++ = 0x01; *a++ = 0xf0; *a++ = 0x23; *a++ = 0x44; /* and t0,31,t0 */
749     *a++ = 0x21; *a++ = 0x07; *a++ = 0x41; *a++ = 0x48; /* sll t1,t0,t0 */
750     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,t0 */
751     break;
752     case SPECIAL_DSLL:
753     *a++ = 0x20 + alpha_rd; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sll t1,sa,t0 */
754     break;
755     case SPECIAL_DSLL32:
756     sa += 32;
757     *a++ = 0x20 + alpha_rd; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sll t1,sa,t0 */
758     break;
759     case SPECIAL_SRA:
760     *a++ = 0x81; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sra t1,sa,t0 */
761     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,t0 */
762     break;
763     case SPECIAL_SRAV:
764     /* rd = rt >> (rs&31) (arithmetic) t0 = t1 >> (t0&31) */
765     *a++ = 0x01; *a++ = 0xf0; *a++ = 0x23; *a++ = 0x44; /* and t0,31,t0 */
766     *a++ = 0x81; *a++ = 0x07; *a++ = 0x41; *a++ = 0x48; /* sra t1,t0,t0 */
767     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,t0 */
768     break;
769     case SPECIAL_DSRA:
770     *a++ = 0x80 + alpha_rd; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sra t1,sa,t0 */
771     break;
772     case SPECIAL_DSRA32:
773     sa += 32;
774     *a++ = 0x80 + alpha_rd; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sra t1,sa,t0 */
775     break;
776     case SPECIAL_SRL:
777     *a++ = 0x22; *a++ = 0xf6; *a++ = 0x41; *a++ = 0x48; /* zapnot t1,0xf,t1 (use only lowest 32 bits) */
778     /* Note: bits of sa are distributed among two different bytes. */
779     *a++ = 0x81; *a++ = 0x16 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48;
780     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl */
781     break;
782     case SPECIAL_SRLV:
783     /* rd = rt >> (rs&31) (logical) t0 = t1 >> (t0&31) */
784     *a++ = 0x22; *a++ = 0xf6; *a++ = 0x41; *a++ = 0x48; /* zapnot t1,0xf,t1 (use only lowest 32 bits) */
785     *a++ = 0x01; *a++ = 0xf0; *a++ = 0x23; *a++ = 0x44; /* and t0,31,t0 */
786     *a++ = 0x81; *a++ = 0x06; *a++ = 0x41; *a++ = 0x48; /* srl t1,t0,t0 */
787     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,t0 */
788     break;
789     case SPECIAL_DSRL:
790     /* Note: bits of sa are distributed among two different bytes. */
791     *a++ = 0x80 + alpha_rd; *a++ = 0x16 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48;
792     break;
793     case SPECIAL_DSRL32:
794     /* Note: bits of sa are distributed among two different bytes. */
795     sa += 32;
796     *a++ = 0x80 + alpha_rd; *a++ = 0x16 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48;
797     break;
798     case SPECIAL_SLT:
799     *a++ = 0xa0 + alpha_rd; *a++ = 0x09; *a++ = 0x22; *a++ = 0x40; /* cmplt t0,t1,t0 */
800     break;
801     case SPECIAL_SLTU:
802     *a++ = 0xa0 + alpha_rd; *a++ = 0x03; *a++ = 0x22; *a++ = 0x40; /* cmpult t0,t1,t0 */
803     break;
804     case SPECIAL_MULT:
805     case SPECIAL_MULTU:
806     if (instruction_type == SPECIAL_MULTU) {
807     /* 21 f6 21 48 zapnot t0,0xf,t0 */
808     /* 22 f6 41 48 zapnot t1,0xf,t1 */
809     *a++ = 0x21; *a++ = 0xf6; *a++ = 0x21; *a++ = 0x48;
810     *a++ = 0x22; *a++ = 0xf6; *a++ = 0x41; *a++ = 0x48;
811     }
812    
813     /* 03 04 22 4c mulq t0,t1,t2 */
814     *a++ = 0x03; *a++ = 0x04; *a++ = 0x22; *a++ = 0x4c;
815    
816     /* 01 10 60 40 addl t2,0,t0 */
817     *a++ = 0x01; *a++ = 0x10; *a++ = 0x60; *a++ = 0x40;
818    
819     ofs = ((size_t)&dummy_cpu.cd.mips.lo) - (size_t)&dummy_cpu;
820     *a++ = (ofs & 255); *a++ = (ofs >> 8); *a++ = 0x30; *a++ = 0xb4;
821    
822     /* 81 17 64 48 sra t2,0x20,t0 */
823     *a++ = 0x81; *a++ = 0x17; *a++ = 0x64; *a++ = 0x48;
824     *a++ = 0x01; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,t0 */
825     ofs = ((size_t)&dummy_cpu.cd.mips.hi) - (size_t)&dummy_cpu;
826     *a++ = (ofs & 255); *a++ = (ofs >> 8); *a++ = 0x30; *a++ = 0xb4;
827     break;
828     case SPECIAL_MOVZ:
829     /* if rt=0 then rd=rs ==> if t1!=0 then t0=unmodified else t0=rd */
830     /* 00 00 40 f4 bne t1,unmodified */
831     unmodified = a;
832     *a++ = 0x00; *a++ = 0x00; *a++ = 0x40; *a++ = 0xf4;
833     alpha_rd = ALPHA_T0;
834     break;
835     case SPECIAL_MOVN:
836     /* if rt!=0 then rd=rs ==> if t1=0 then t0=unmodified else t0=rd */
837     /* 00 00 40 e4 beq t1,unmodified */
838     unmodified = a;
839     *a++ = 0x00; *a++ = 0x00; *a++ = 0x40; *a++ = 0xe4;
840     alpha_rd = ALPHA_T0;
841     break;
842     }
843    
844     if (store && alpha_rd == ALPHA_T0) {
845     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rd);
846     }
847    
848     if (unmodified != NULL)
849     *unmodified = ((size_t)a - (size_t)unmodified - 4) / 4;
850    
851     *addrp = a;
852     rd0:
853     bintrans_write_pc_inc(addrp);
854     return 1;
855     }
856    
857    
858     /*
859     * bintrans_write_instruction__branch():
860     */
861     static int bintrans_write_instruction__branch(unsigned char **addrp,
862     int instruction_type, int regimm_type, int rt, int rs, int imm)
863     {
864     uint32_t *a, *b, *c = NULL;
865     int alpha_rs, alpha_rt, likely = 0, ofs;
866    
867     alpha_rs = map_MIPS_to_Alpha[rs];
868     alpha_rt = map_MIPS_to_Alpha[rt];
869    
870     switch (instruction_type) {
871     case HI6_BEQL:
872     case HI6_BNEL:
873     case HI6_BLEZL:
874     case HI6_BGTZL:
875     likely = 1;
876     }
877    
878     /*
879     * t0 = gpr[rt]; t1 = gpr[rs];
880     *
881     * 50 00 30 a4 ldq t0,80(a0)
882     * 58 00 50 a4 ldq t1,88(a0)
883     */
884    
885     switch (instruction_type) {
886     case HI6_BEQ:
887     case HI6_BNE:
888     case HI6_BEQL:
889     case HI6_BNEL:
890     if (alpha_rt < 0) {
891     bintrans_move_MIPS_reg_into_Alpha_reg(addrp, rt, ALPHA_T0);
892     alpha_rt = ALPHA_T0;
893     }
894     }
895    
896     if (alpha_rs < 0) {
897     bintrans_move_MIPS_reg_into_Alpha_reg(addrp, rs, ALPHA_T1);
898     alpha_rs = ALPHA_T1;
899     }
900    
901     a = (uint32_t *) *addrp;
902    
903     /*
904     * Compare alpha_rt (t0) and alpha_rs (t1) for equality (BEQ).
905     * If the result was false (equal to zero), then skip a lot
906     * of instructions:
907     *
908     * a1 05 22 40 cmpeq t0,t1,t0
909     * 01 00 20 e4 beq t0,14 <f+0x14>
910     */
911     b = NULL;
912     if ((instruction_type == HI6_BEQ ||
913     instruction_type == HI6_BEQL) && rt != rs) {
914     /* cmpeq rt,rs,t0 */
915     *a++ = 0x400005a1 | (alpha_rt << 21) | (alpha_rs << 16);
916     b = a;
917     *a++ = 0xe4200001; /* beq */
918     }
919     if (instruction_type == HI6_BNE || instruction_type == HI6_BNEL) {
920     /* cmpeq rt,rs,t0 */
921     *a++ = 0x400005a1 | (alpha_rt << 21) | (alpha_rs << 16);
922     b = a;
923     *a++ = 0xf4200001; /* bne */
924     }
925     if (instruction_type == HI6_BLEZ || instruction_type == HI6_BLEZL) {
926     /* cmple rs,0,t0 */
927     *a++ = 0x40001da1 | (alpha_rs << 21);
928     b = a;
929     *a++ = 0xe4200001; /* beq */
930     }
931     if (instruction_type == HI6_BGTZ || instruction_type == HI6_BGTZL) {
932     /* cmple rs,0,t0 */
933     *a++ = 0x40001da1 | (alpha_rs << 21);
934     b = a;
935     *a++ = 0xf4200001; /* bne */
936     }
937     if (instruction_type == HI6_REGIMM && regimm_type == REGIMM_BLTZ) {
938     /* cmplt rs,0,t0 */
939     *a++ = 0x400019a1 | (alpha_rs << 21);
940     b = a;
941     *a++ = 0xe4200001; /* beq */
942     }
943     if (instruction_type == HI6_REGIMM && regimm_type == REGIMM_BGEZ) {
944     *a++ = 0x207fffff; /* lda t2,-1 */
945     /* cmple rs,t2,t0 */
946     *a++ = 0x40030da1 | (alpha_rs << 21);
947     b = a;
948     *a++ = 0xf4200001; /* bne */
949     }
950    
951     /*
952     * Perform the jump by setting cpu->delay_slot = TO_BE_DELAYED
953     * and cpu->delay_jmpaddr = pc + 4 + (imm << 2).
954     *
955     * 04 00 26 20 lda t0,4(t5) add 4
956     * c8 01 5f 20 lda t1,456
957     * 4a 04 41 40 s4addq t1,t0,s1 s1 = (t1<<2) + t0
958     */
959    
960     *a++ = 0x20260004; /* lda t0,4(t5) */
961     *a++ = 0x205f0000 | (imm & 0xffff); /* lda */
962     *a++ = 0x4041044a; /* s4addq */
963    
964     /* 02 00 3f 21 lda s0,TO_BE_DELAYED */
965     *a++ = 0x213f0000 | TO_BE_DELAYED;
966    
967     /*
968     * Special case: "likely"-branches:
969     */
970     if (likely) {
971     c = a;
972     *a++ = 0xc3e00001; /* br delayed_ok */
973    
974     if (b != NULL)
975     *((unsigned char *)b) = ((size_t)a - (size_t)b - 4) / 4;
976    
977     /* cpu->cd.mips.nullify_next = 1; */
978     /* 01 00 3f 20 lda t0,1 */
979     *a++ = 0x203f0001;
980     ofs = (size_t)&dummy_cpu.cd.mips.nullify_next - (size_t)&dummy_cpu;
981     *a++ = 0xb0300000 | (ofs & 0xffff);
982    
983     /* fail, so that the next instruction is handled manually: */
984     *addrp = (unsigned char *) a;
985     bintrans_write_pc_inc(addrp);
986     bintrans_write_chunkreturn_fail(addrp);
987     a = (uint32_t *) *addrp;
988    
989     if (c != NULL)
990     *((unsigned char *)c) = ((size_t)a - (size_t)c - 4) / 4;
991     } else {
992     /* Normal (non-likely) exit: */
993     if (b != NULL)
994     *((unsigned char *)b) = ((size_t)a - (size_t)b - 4) / 4;
995     }
996    
997     *addrp = (unsigned char *) a;
998     bintrans_write_pc_inc(addrp);
999     return 1;
1000     }
1001    
1002    
1003     /*
1004     * bintrans_write_instruction__jr():
1005     */
1006     static int bintrans_write_instruction__jr(unsigned char **addrp, int rs, int rd, int special)
1007     {
1008     uint32_t *a;
1009     int alpha_rd;
1010    
1011     alpha_rd = map_MIPS_to_Alpha[rd];
1012     if (alpha_rd < 0)
1013     alpha_rd = ALPHA_T0;
1014    
1015     /*
1016     * Perform the jump by setting cpu->delay_slot = TO_BE_DELAYED
1017     * and cpu->delay_jmpaddr = gpr[rs].
1018     */
1019    
1020     bintrans_move_MIPS_reg_into_Alpha_reg(addrp, rs, ALPHA_S1);
1021    
1022     a = (uint32_t *) *addrp;
1023     /* 02 00 3f 21 lda s0,TO_BE_DELAYED */
1024     *a++ = 0x213f0000 | TO_BE_DELAYED;
1025     *addrp = (unsigned char *) a;
1026    
1027     if (special == SPECIAL_JALR && rd != 0) {
1028     /* gpr[rd] = retaddr (pc + 8) */
1029     a = (uint32_t *) *addrp;
1030     /* lda alpha_rd,8(t5) */
1031     *a++ = 0x20060008 | (alpha_rd << 21);
1032     *addrp = (unsigned char *) a;
1033     if (alpha_rd == ALPHA_T0)
1034     bintrans_move_Alpha_reg_into_MIPS_reg(addrp, ALPHA_T0, rd);
1035     }
1036    
1037     bintrans_write_pc_inc(addrp);
1038     return 1;
1039     }
1040    
1041    
1042     /*
1043     * bintrans_write_instruction__jal():
1044     */
1045     static int bintrans_write_instruction__jal(unsigned char **addrp,
1046     int imm, int link)
1047     {
1048     uint32_t *a;
1049    
1050     a = (uint32_t *) *addrp;
1051    
1052     /* gpr[31] = retaddr (NOTE: mips register 31 is in alpha reg s3) */
1053     if (link) {
1054     *a++ = 0x21860008; /* lda s3,8(t5) */
1055     }
1056    
1057     /* Set the jmpaddr to top 4 bits of pc + lowest 28 bits of imm*4: */
1058    
1059     /*
1060     * imm = 4*imm;
1061     * t0 = ((pc + 4) & ~0x0fffffff) | imm;
1062     *
1063     * 04 00 26 20 lda t0,4(t5) <-- because the jump is from the delay slot
1064     * 23 01 5f 24 ldah t1,291
1065     * 67 45 42 20 lda t1,17767(t1)
1066     * 00 f0 7f 24 ldah t2,-4096
1067     * 04 00 23 44 and t0,t2,t3
1068     * 0a 04 44 44 or t1,t3,s1
1069     */
1070     imm *= 4;
1071     *a++ = 0x20260004;
1072     *a++ = 0x245f0000 | ((imm >> 16) + (imm & 0x8000? 1 : 0));
1073     *a++ = 0x20420000 | (imm & 0xffff);
1074     *a++ = 0x247ff000;
1075     *a++ = 0x44230004;
1076     *a++ = 0x4444040a;
1077    
1078     /* 02 00 3f 21 lda s0,TO_BE_DELAYED */
1079     *a++ = 0x213f0000 | TO_BE_DELAYED;
1080    
1081     /* If the machine continues executing here, it will return
1082     to the main loop, which is fine. */
1083    
1084     *addrp = (unsigned char *) a;
1085     bintrans_write_pc_inc(addrp);
1086     return 1;
1087     }
1088    
1089    
1090     /*
1091     * bintrans_write_instruction__delayedbranch():
1092     */
1093     static int bintrans_write_instruction__delayedbranch(
1094     struct memory *mem, unsigned char **addrp,
1095     uint32_t *potential_chunk_p, uint32_t *chunks,
1096     int only_care_about_chunk_p, int p, int forward)
1097     {
1098     unsigned char *a, *skip=NULL, *generic64bit;
1099     int ofs;
1100     uint64_t alpha_addr, subaddr;
1101    
1102     a = *addrp;
1103    
1104     if (!only_care_about_chunk_p) {
1105     /* Skip all of this if there is no branch: */
1106     skip = a;
1107     *a++ = 0; *a++ = 0; *a++ = 0x20; *a++ = 0xe5; /* beq s0,skip */
1108    
1109     /*
1110     * Perform the jump by setting cpu->delay_slot = 0
1111     * and pc = cpu->delay_jmpaddr.
1112     */
1113     /* 00 00 3f 21 lda s0,0 */
1114     *a++ = 0; *a++ = 0; *a++ = 0x3f; *a++ = 0x21;
1115    
1116     bintrans_move_MIPS_reg_into_Alpha_reg(&a, MIPSREG_DELAY_JMPADDR, ALPHA_T0);
1117     bintrans_move_MIPS_reg_into_Alpha_reg(&a, MIPSREG_PC, ALPHA_T3);
1118     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, MIPSREG_PC);
1119     }
1120    
1121     if (potential_chunk_p == NULL) {
1122     if (mem->bintrans_32bit_only) {
1123     /* 34 12 70 a7 ldq t12,4660(a0) */
1124     ofs = (size_t)&dummy_cpu.cd.mips.bintrans_jump_to_32bit_pc - (size_t)&dummy_cpu;
1125     *a++ = ofs; *a++ = ofs >> 8; *a++ = 0x70; *a++ = 0xa7;
1126    
1127     /* 00 00 fb 6b jmp (t12) */
1128     *a++ = 0; *a++ = 0; *a++ = 0xfb; *a++ = 0x6b;
1129     } else {
1130     /*
1131     * If the highest 32 bits of the address are either
1132     * 0x00000000 or 0xffffffff, then the tables used for
1133     * 32-bit load/stores can be used.
1134     *
1135     * 81 16 24 4a srl a1,0x20,t0
1136     * 03 00 20 e4 beq t0,14 <ok1>
1137     * 01 30 20 40 addl t0,0x1,t0
1138     * 01 00 20 e4 beq t0,14 <ok1>
1139     * 01 00 e0 c3 br 18 <nook>
1140     */
1141     *a++ = 0x81; *a++ = 0x16; *a++ = 0x24; *a++ = 0x4a;
1142     *a++ = 0x03; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4;
1143     *a++ = 0x01; *a++ = 0x30; *a++ = 0x20; *a++ = 0x40;
1144     *a++ = 0x01; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4;
1145     generic64bit = a;
1146     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1147    
1148     /* 34 12 70 a7 ldq t12,4660(a0) */
1149     ofs = (size_t)&dummy_cpu.cd.mips.bintrans_jump_to_32bit_pc - (size_t)&dummy_cpu;
1150     *a++ = ofs; *a++ = ofs >> 8; *a++ = 0x70; *a++ = 0xa7;
1151    
1152     /* 00 00 fb 6b jmp (t12) */
1153     *a++ = 0; *a++ = 0; *a++ = 0xfb; *a++ = 0x6b;
1154    
1155    
1156     if (generic64bit != NULL)
1157     *generic64bit = ((size_t)a - (size_t)generic64bit - 4) / 4;
1158    
1159     /* Not much we can do here if this wasn't to the same
1160     physical page... */
1161    
1162     *a++ = 0xfc; *a++ = 0xff; *a++ = 0x84; *a++ = 0x20; /* lda t3,-4(t3) */
1163    
1164     /*
1165     * Compare the old pc (t3) and the new pc (t0). If they are on the
1166     * same virtual page (which means that they are on the same physical
1167     * page), then we can check the right chunk pointer, and if it
1168     * is non-NULL, then we can jump there. Otherwise just return.
1169     *
1170     * 00 f0 5f 20 lda t1,-4096
1171     * 01 00 22 44 and t0,t1,t0
1172     * 04 00 82 44 and t3,t1,t3
1173     * a3 05 24 40 cmpeq t0,t3,t2
1174     * 01 00 60 f4 bne t2,7c <ok2>
1175     * 01 80 fa 6b ret
1176     */
1177     *a++ = 0x00; *a++ = 0xf0; *a++ = 0x5f; *a++ = 0x20; /* lda */
1178     *a++ = 0x01; *a++ = 0x00; *a++ = 0x22; *a++ = 0x44; /* and */
1179     *a++ = 0x04; *a++ = 0x00; *a++ = 0x82; *a++ = 0x44; /* and */
1180     *a++ = 0xa3; *a++ = 0x05; *a++ = 0x24; *a++ = 0x40; /* cmpeq */
1181     *a++ = 0x01; *a++ = 0x00; *a++ = 0x60; *a++ = 0xf4; /* bne */
1182     *a++ = 0x01; *a++ = 0x80; *a++ = 0xfa; *a++ = 0x6b; /* ret */
1183    
1184     /* Don't execute too many instructions. (see comment below) */
1185     *a++ = (N_SAFE_BINTRANS_LIMIT-1)&255; *a++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8)&255;
1186     *a++ = 0x5f; *a++ = 0x20; /* lda t1,0x1fff */
1187     *a++ = 0xa1; *a++ = 0x0d; *a++ = 0xe2; *a++ = 0x40; /* cmple t6,t1,t0 */
1188     *a++ = 0x01; *a++ = 0x00; *a++ = 0x20; *a++ = 0xf4; /* bne */
1189     *a++ = 0x01; *a++ = 0x80; *a++ = 0xfa; *a++ = 0x6b; /* ret */
1190    
1191     /* 15 bits at a time, which means max 60 bits, but
1192     that should be enough. the top 4 bits are probably
1193     not used by userland alpha code. (TODO: verify this) */
1194     alpha_addr = (size_t)chunks;
1195     subaddr = (alpha_addr >> 45) & 0x7fff;
1196    
1197     /*
1198     * 00 00 3f 20 lda t0,0
1199     * 21 f7 21 48 sll t0,0xf,t0
1200     * 34 12 21 20 lda t0,4660(t0)
1201     * 21 f7 21 48 sll t0,0xf,t0
1202     * 34 12 21 20 lda t0,4660(t0)
1203     * 21 f7 21 48 sll t0,0xf,t0
1204     * 34 12 21 20 lda t0,4660(t0)
1205     */
1206    
1207     /* Start with the topmost 15 bits: */
1208     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x3f; *a++ = 0x20;
1209     *a++ = 0x21; *a++ = 0xf7; *a++ = 0x21; *a++ = 0x48; /* sll */
1210    
1211     subaddr = (alpha_addr >> 30) & 0x7fff;
1212     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x21; *a++ = 0x20;
1213     *a++ = 0x21; *a++ = 0xf7; *a++ = 0x21; *a++ = 0x48; /* sll */
1214    
1215     subaddr = (alpha_addr >> 15) & 0x7fff;
1216     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x21; *a++ = 0x20;
1217     *a++ = 0x21; *a++ = 0xf7; *a++ = 0x21; *a++ = 0x48; /* sll */
1218    
1219     subaddr = alpha_addr & 0x7fff;
1220     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x21; *a++ = 0x20;
1221    
1222     /*
1223     * t2 = pc
1224     * t1 = t2 & 0xfff
1225     * t0 += t1
1226     *
1227     * ff 0f 5f 20 lda t1,4095
1228     * 02 00 62 44 and t2,t1,t1
1229     * 01 04 22 40 addq t0,t1,t0
1230     */
1231     bintrans_move_MIPS_reg_into_Alpha_reg(&a, MIPSREG_PC, ALPHA_T2);
1232     *a++ = 0xff; *a++ = 0x0f; *a++ = 0x5f; *a++ = 0x20; /* lda */
1233     *a++ = 0x02; *a++ = 0x00; *a++ = 0x62; *a++ = 0x44; /* and */
1234     *a++ = 0x01; *a++ = 0x04; *a++ = 0x22; *a++ = 0x40; /* addq */
1235    
1236     /*
1237     * Load the chunk pointer (actually, a 32-bit offset) into t0.
1238     * If it is zero, then skip the following.
1239     * Add cpu->chunk_base_address to t0.
1240     * Jump to t0.
1241     */
1242    
1243     *a++ = 0x00; *a++ = 0x00; *a++ = 0x21; *a++ = 0xa0; /* ldl t0,0(t0) */
1244     *a++ = 0x03; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4; /* beq t0,<skip> */
1245    
1246     /* ldq t2,chunk_base_address(a0) */
1247     ofs = ((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu;
1248     *a++ = (ofs & 255); *a++ = (ofs >> 8); *a++ = 0x70; *a++ = 0xa4;
1249     /* addq t0,t2,t0 */
1250     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x40;
1251    
1252     /* 00 00 e1 6b jmp (t0) */
1253     *a++ = 0x00; *a++ = 0x00; *a++ = 0xe1; *a++ = 0x6b; /* jmp (t0) */
1254    
1255     /* Failure, then return to the main loop. */
1256     *a++ = 0x01; *a++ = 0x80; *a++ = 0xfa; *a++ = 0x6b; /* ret */
1257     }
1258     } else {
1259     /*
1260     * Just to make sure that we don't become too unreliant
1261     * on the main program loop, we need to return every once
1262     * in a while (interrupts etc).
1263     *
1264     * Load the "nr of instructions executed" (which is an int)
1265     * and see if it is below a certain threshold. If so, then
1266     * we go on with the fast path (bintrans), otherwise we
1267     * abort by returning.
1268     *
1269     * f4 01 5f 20 lda t1,500 (some low number...)
1270     * a1 0d c2 40 cmple t6,t1,t0
1271     * 01 00 20 f4 bne t0,14 <f+0x14>
1272     */
1273     if (!only_care_about_chunk_p && !forward) {
1274     *a++ = (N_SAFE_BINTRANS_LIMIT-1)&255; *a++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8)&255;
1275     *a++ = 0x5f; *a++ = 0x20; /* lda t1,0x1fff */
1276     *a++ = 0xa1; *a++ = 0x0d; *a++ = 0xe2; *a++ = 0x40; /* cmple t6,t1,t0 */
1277     *a++ = 0x01; *a++ = 0x00; *a++ = 0x20; *a++ = 0xf4; /* bne */
1278     *a++ = 0x01; *a++ = 0x80; *a++ = 0xfa; *a++ = 0x6b; /* ret */
1279     }
1280    
1281     /*
1282     * potential_chunk_p points to an "uint32_t".
1283     * If this value is non-NULL, then it is a piece of Alpha
1284     * machine language code corresponding to the address
1285     * we're jumping to. Otherwise, those instructions haven't
1286     * been translated yet, so we have to return to the main
1287     * loop. (Actually, we have to add cpu->chunk_base_address,
1288     * because the uint32_t is limited to 32-bit offsets.)
1289     *
1290     * Case 1: The value is non-NULL already at translation
1291     * time. Then we can make a direct (fast) native
1292     * Alpha jump to the code chunk.
1293     *
1294     * Case 2: The value was NULL at translation time, then we
1295     * have to check during runtime.
1296     */
1297    
1298     /* Case 1: */
1299     /* printf("%08x ", *potential_chunk_p); */
1300     alpha_addr = *potential_chunk_p + (size_t)mem->translation_code_chunk_space;
1301     ofs = (alpha_addr - ((size_t)a+4)) / 4;
1302     /* printf("%016llx %016llx %i\n", (long long)alpha_addr, (long long)a, ofs); */
1303    
1304     if ((*potential_chunk_p) != 0 && ofs > -0xfffff && ofs < 0xfffff) {
1305     *a++ = ofs & 255; *a++ = (ofs >> 8) & 255; *a++ = 0xe0 + ((ofs >> 16) & 0x1f); *a++ = 0xc3; /* br <chunk> */
1306     } else {
1307     /* Case 2: */
1308    
1309     bintrans_register_potential_quick_jump(mem, a, p);
1310    
1311     /* 15 bits at a time, which means max 60 bits, but
1312     that should be enough. the top 4 bits are probably
1313     not used by userland alpha code. (TODO: verify this) */
1314     alpha_addr = (size_t)potential_chunk_p;
1315     subaddr = (alpha_addr >> 45) & 0x7fff;
1316    
1317     /*
1318     * 00 00 3f 20 lda t0,0
1319     * 21 f7 21 48 sll t0,0xf,t0
1320     * 34 12 21 20 lda t0,4660(t0)
1321     * 21 f7 21 48 sll t0,0xf,t0
1322     * 34 12 21 20 lda t0,4660(t0)
1323     * 21 f7 21 48 sll t0,0xf,t0
1324     * 34 12 21 20 lda t0,4660(t0)
1325     */
1326    
1327     /* Start with the topmost 15 bits: */
1328     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x3f; *a++ = 0x20;
1329     *a++ = 0x21; *a++ = 0xf7; *a++ = 0x21; *a++ = 0x48; /* sll */
1330    
1331     subaddr = (alpha_addr >> 30) & 0x7fff;
1332     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x21; *a++ = 0x20;
1333     *a++ = 0x21; *a++ = 0xf7; *a++ = 0x21; *a++ = 0x48; /* sll */
1334    
1335     subaddr = (alpha_addr >> 15) & 0x7fff;
1336     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x21; *a++ = 0x20;
1337     *a++ = 0x21; *a++ = 0xf7; *a++ = 0x21; *a++ = 0x48; /* sll */
1338    
1339     subaddr = alpha_addr & 0x7fff;
1340     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x21; *a++ = 0x20;
1341    
1342     /*
1343     * Load the chunk pointer into t0.
1344     * If it is NULL (zero), then skip the following jump.
1345     * Jump to t0.
1346     */
1347     *a++ = 0x00; *a++ = 0x00; *a++ = 0x21; *a++ = 0xa0; /* ldl t0,0(t0) */
1348     *a++ = 0x03; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4; /* beq t0,<skip> */
1349    
1350     /* ldq t2,chunk_base_address(a0) */
1351     ofs = ((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu;
1352     *a++ = (ofs & 255); *a++ = (ofs >> 8); *a++ = 0x70; *a++ = 0xa4;
1353     /* addq t0,t2,t0 */
1354     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x40;
1355    
1356     /* 00 00 e1 6b jmp (t0) */
1357     *a++ = 0x00; *a++ = 0x00; *a++ = 0xe1; *a++ = 0x6b; /* jmp (t0) */
1358    
1359     /* "Failure", then let's return to the main loop. */
1360     *a++ = 0x01; *a++ = 0x80; *a++ = 0xfa; *a++ = 0x6b; /* ret */
1361     }
1362     }
1363    
1364     if (skip != NULL) {
1365     *skip = ((size_t)a - (size_t)skip - 4) / 4;
1366     skip ++;
1367     *skip = (((size_t)a - (size_t)skip - 4) / 4) >> 8;
1368     }
1369    
1370     *addrp = a;
1371     return 1;
1372     }
1373    
1374    
1375     /*
1376     * bintrans_write_instruction__loadstore():
1377     */
1378     static int bintrans_write_instruction__loadstore(
1379     struct memory *mem, unsigned char **addrp,
1380     int rt, int imm, int rs, int instruction_type, int bigendian)
1381     {
1382     unsigned char *a, *fail, *generic64bit = NULL, *generic64bitA = NULL;
1383     unsigned char *doloadstore = NULL,
1384     *ok_unaligned_load3, *ok_unaligned_load2, *ok_unaligned_load1;
1385     uint32_t *b;
1386     int ofs, alignment, load = 0, alpha_rs, alpha_rt, unaligned = 0;
1387    
1388     /* TODO: Not yet: */
1389     if (instruction_type == HI6_LQ_MDMX || instruction_type == HI6_SQ) {
1390     return 0;
1391     }
1392    
1393     switch (instruction_type) {
1394     case HI6_LQ_MDMX:
1395     case HI6_LD:
1396     case HI6_LDL:
1397     case HI6_LDR:
1398     case HI6_LWU:
1399     case HI6_LW:
1400     case HI6_LWL:
1401     case HI6_LWR:
1402     case HI6_LHU:
1403     case HI6_LH:
1404     case HI6_LBU:
1405     case HI6_LB:
1406     load = 1;
1407     if (rt == 0)
1408     return 0;
1409     }
1410    
1411     switch (instruction_type) {
1412     case HI6_LDL:
1413     case HI6_LDR:
1414     case HI6_LWL:
1415     case HI6_LWR:
1416     case HI6_SDL:
1417     case HI6_SDR:
1418     case HI6_SWL:
1419     case HI6_SWR:
1420     unaligned = 1;
1421     }
1422    
1423     a = *addrp;
1424    
1425     /*
1426     * a1 = gpr[rs] + imm;
1427     *
1428     * 88 08 30 a4 ldq t0,2184(a0)
1429     * 34 12 21 22 lda a1,4660(t0)
1430     */
1431    
1432     alpha_rs = map_MIPS_to_Alpha[rs];
1433     if (alpha_rs < 0) {
1434     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
1435     alpha_rs = ALPHA_T0;
1436     }
1437     *a++ = imm; *a++ = (imm >> 8); *a++ = 0x20 + alpha_rs; *a++ = 0x22;
1438    
1439     alignment = 0;
1440     switch (instruction_type) {
1441     case HI6_LQ_MDMX:
1442     case HI6_SQ:
1443     alignment = 15;
1444     break;
1445     case HI6_LD:
1446     case HI6_LDL:
1447     case HI6_LDR:
1448     case HI6_SD:
1449     case HI6_SDL:
1450     case HI6_SDR:
1451     alignment = 7;
1452     break;
1453     case HI6_LW:
1454     case HI6_LWL:
1455     case HI6_LWR:
1456     case HI6_LWU:
1457     case HI6_SW:
1458     case HI6_SWL:
1459     case HI6_SWR:
1460     alignment = 3;
1461     break;
1462     case HI6_LH:
1463     case HI6_LHU:
1464     case HI6_SH:
1465     alignment = 1;
1466     break;
1467     }
1468    
1469     if (unaligned) {
1470     /*
1471     * Unaligned load/store: Perform the host load/store at
1472     * an aligned address, and then figure out which bytes to
1473     * actually load into the destination register.
1474     *
1475     * 02 30 20 46 and a1,alignment,t1
1476     * 31 05 22 42 subq a1,t1,a1
1477     */
1478     *a++ = 0x02; *a++ = 0x10 + alignment * 0x20; *a++ = 0x20 + (alignment >> 3); *a++ = 0x46;
1479     *a++ = 0x31; *a++ = 0x05; *a++ = 0x22; *a++ = 0x42;
1480     } else if (alignment > 0) {
1481     /*
1482     * Check alignment:
1483     *
1484     * 02 30 20 46 and a1,0x1,t1
1485     * 02 70 20 46 and a1,0x3,t1 (one of these "and"s)
1486     * 02 f0 20 46 and a1,0x7,t1
1487     * 02 f0 21 46 and a1,0xf,t1
1488     * 01 00 40 e4 beq t1,<okalign>
1489     * 01 80 fa 6b ret
1490     */
1491     *a++ = 0x02; *a++ = 0x10 + alignment * 0x20; *a++ = 0x20 + (alignment >> 3); *a++ = 0x46;
1492     fail = a;
1493     *a++ = 0x01; *a++ = 0x00; *a++ = 0x40; *a++ = 0xe4;
1494     *addrp = a;
1495     bintrans_write_chunkreturn_fail(addrp);
1496     a = *addrp;
1497     *fail = ((size_t)a - (size_t)fail - 4) / 4;
1498     }
1499    
1500     alpha_rt = map_MIPS_to_Alpha[rt];
1501    
1502     if (mem->bintrans_32bit_only) {
1503     /* Special case for 32-bit addressing: */
1504    
1505 dpavlin 10 if (load)
1506     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_load_32bit) - (size_t)&dummy_cpu;
1507     else
1508     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_store_32bit) - (size_t)&dummy_cpu;
1509 dpavlin 2 /* ldq t12,bintrans_loadstore_32bit(a0) */
1510     *a++ = ofs; *a++ = ofs >> 8; *a++ = 0x70; *a++ = 0xa7;
1511    
1512     /* jsr t4,(t12),<after> */
1513     *a++ = 0x00; *a++ = 0x40; *a++ = 0xbb; *a++ = 0x68;
1514    
1515 dpavlin 10 /* Now: a3 = host page, t3 = address of host load/store */
1516 dpavlin 2 } else {
1517     /*
1518     * If the highest 33 bits of the address are either all ones
1519     * or all zeroes, then the tables used for 32-bit load/stores
1520     * can be used.
1521     */
1522     *a++ = 0x81; *a++ = 0xf6; *a++ = 0x23; *a++ = 0x4a; /* srl a1,0x1f,t0 */
1523     *a++ = 0x01; *a++ = 0x30; *a++ = 0x20; *a++ = 0x44; /* and t0,0x1,t0 */
1524     *a++ = 0x04; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4; /* beq t0,<noll> */
1525     *a++ = 0x81; *a++ = 0x16; *a++ = 0x24; *a++ = 0x4a; /* srl a1,0x20,t0 */
1526     *a++ = 0x01; *a++ = 0x30; *a++ = 0x20; *a++ = 0x40; /* addl t0,0x1,t0 */
1527     *a++ = 0x04; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4; /* beq t0,<ok> */
1528     generic64bit = a;
1529     *a++ = 0x04; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3; /* br <generic> */
1530     /* <noll>: */
1531     *a++ = 0x81; *a++ = 0x16; *a++ = 0x24; *a++ = 0x4a; /* srl a1,0x20,t0 */
1532     *a++ = 0x01; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4; /* beq t0,<ok> */
1533     generic64bitA = a;
1534     *a++ = 0x04; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3; /* br <generic> */
1535    
1536 dpavlin 10 if (load)
1537     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_load_32bit) - (size_t)&dummy_cpu;
1538     else
1539     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_store_32bit) - (size_t)&dummy_cpu;
1540 dpavlin 2 /* ldq t12,bintrans_loadstore_32bit(a0) */
1541     *a++ = ofs; *a++ = ofs >> 8; *a++ = 0x70; *a++ = 0xa7;
1542    
1543     /* jsr t4,(t12),<after> */
1544     *a++ = 0x00; *a++ = 0x40; *a++ = 0xbb; *a++ = 0x68;
1545    
1546     /*
1547 dpavlin 10 * Now: a3 = host page
1548     * t3 = (potential) address of host load/store
1549 dpavlin 2 */
1550    
1551     doloadstore = a;
1552     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1553    
1554    
1555     /*
1556     * Generic (64-bit) load/store:
1557     */
1558    
1559     if (generic64bit != NULL)
1560     *generic64bit = ((size_t)a - (size_t)generic64bit - 4) / 4;
1561     if (generic64bitA != NULL)
1562     *generic64bitA = ((size_t)a - (size_t)generic64bitA - 4) / 4;
1563    
1564     *addrp = a;
1565     b = (uint32_t *) *addrp;
1566    
1567     /* Save a0 and the old return address on the stack: */
1568     *b++ = 0x23deff80; /* lda sp,-128(sp) */
1569    
1570     *b++ = 0xb75e0000; /* stq ra,0(sp) */
1571     *b++ = 0xb61e0008; /* stq a0,8(sp) */
1572     *b++ = 0xb4de0010; /* stq t5,16(sp) */
1573     *b++ = 0xb0fe0018; /* stl t6,24(sp) */
1574     *b++ = 0xb71e0020; /* stq t10,32(sp) */
1575     *b++ = 0xb73e0028; /* stq t11,40(sp) */
1576     *b++ = 0xb51e0030; /* stq t7,48(sp) */
1577     *b++ = 0xb6de0038; /* stq t8,56(sp) */
1578     *b++ = 0xb6fe0040; /* stq t9,64(sp) */
1579    
1580     ofs = ((size_t)&dummy_cpu.cd.mips.fast_vaddr_to_hostaddr) - (size_t)&dummy_cpu;
1581    
1582     *b++ = 0xa7700000 | ofs; /* ldq t12,0(a0) */
1583    
1584     /* a1 is already vaddr. set a2 = writeflag */
1585     *b++ = 0x225f0000 | (load? 0 : 1);
1586    
1587     /* Call fast_vaddr_to_hostaddr: */
1588     *b++ = 0x6b5b4000; /* jsr ra,(t12),<after> */
1589    
1590     /* Restore the old return address and a0 from the stack: */
1591     *b++ = 0xa75e0000; /* ldq ra,0(sp) */
1592     *b++ = 0xa61e0008; /* ldq a0,8(sp) */
1593     *b++ = 0xa4de0010; /* ldq t5,16(sp) */
1594     *b++ = 0xa0fe0018; /* ldl t6,24(sp) */
1595     *b++ = 0xa71e0020; /* ldq t10,32(sp) */
1596     *b++ = 0xa73e0028; /* ldq t11,40(sp) */
1597     *b++ = 0xa51e0030; /* ldq t7,48(sp) */
1598     *b++ = 0xa6de0038; /* ldq t8,56(sp) */
1599     *b++ = 0xa6fe0040; /* ldq t9,64(sp) */
1600    
1601     *b++ = 0x23de0080; /* lda sp,128(sp) */
1602    
1603     *addrp = (unsigned char *) b;
1604     a = *addrp;
1605    
1606     /*
1607     * NULL? Then return failure.
1608     * 01 00 00 f4 bne v0,f8 <okzz>
1609     */
1610     fail = a;
1611     *a++ = 0x01; *a++ = 0x00; *a++ = 0x00; *a++ = 0xf4;
1612     bintrans_write_chunkreturn_fail(&a);
1613     *fail = ((size_t)a - (size_t)fail - 4) / 4;
1614    
1615     /* The rest of this code was written with t3 as the address. */
1616    
1617     /* 04 14 00 40 addq v0,0,t3 */
1618     *a++ = 0x04; *a++ = 0x14; *a++ = 0x00; *a++ = 0x40;
1619    
1620     if (doloadstore != NULL)
1621     *doloadstore = ((size_t)a - (size_t)doloadstore - 4) / 4;
1622     }
1623    
1624    
1625     switch (instruction_type) {
1626     case HI6_LQ_MDMX:
1627     /* TODO */
1628     break;
1629     case HI6_LD:
1630     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xa4; /* ldq t0,0(t3) */
1631     if (bigendian) {
1632     /* remember original 8 bytes of t0: */
1633     *a++ = 0x05; *a++ = 0x04; *a++ = 0x3f; *a++ = 0x40; /* addq t0,zero,t4 */
1634    
1635     /* swap lowest 4 bytes: */
1636     *a++ = 0x62; *a++ = 0x71; *a++ = 0x20; *a++ = 0x48; /* insbl t0,3,t1 */
1637     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1638     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48; /* sll t2,16,t2 */
1639     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1640     *a++ = 0xc3; *a++ = 0x50; *a++ = 0x20; *a++ = 0x48; /* extbl t0,2,t2 */
1641     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48; /* sll t2,8,t2 */
1642     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1643     *a++ = 0xc3; *a++ = 0x70; *a++ = 0x20; *a++ = 0x48; /* extbl t0,3,t2 */
1644     *a++ = 0x01; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t0 */
1645    
1646     /* save result in (top 4 bytes of) t1, then t4. get back top bits of t4: */
1647     *a++ = 0x22; *a++ = 0x17; *a++ = 0x24; *a++ = 0x48; /* sll t0,0x20,t1 */
1648     *a++ = 0x81; *a++ = 0x16; *a++ = 0xa4; *a++ = 0x48; /* srl t4,0x20,t0 */
1649     *a++ = 0x05; *a++ = 0x14; *a++ = 0x40; *a++ = 0x40; /* addq t1,0,t4 */
1650    
1651     /* swap highest 4 bytes: */
1652     *a++ = 0x62; *a++ = 0x71; *a++ = 0x20; *a++ = 0x48; /* insbl t0,3,t1 */
1653     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1654     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48; /* sll t2,16,t2 */
1655     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1656     *a++ = 0xc3; *a++ = 0x50; *a++ = 0x20; *a++ = 0x48; /* extbl t0,2,t2 */
1657     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48; /* sll t2,8,t2 */
1658     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1659     *a++ = 0xc3; *a++ = 0x70; *a++ = 0x20; *a++ = 0x48; /* extbl t0,3,t2 */
1660     *a++ = 0x01; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t0 */
1661    
1662     /* or the results together: */
1663     *a++ = 0x01; *a++ = 0x04; *a++ = 0xa1; *a++ = 0x44; /* or t4,t0,t0 */
1664     }
1665     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rt);
1666     break;
1667     case HI6_LW:
1668     case HI6_LWU:
1669     if (alpha_rt < 0 || bigendian || instruction_type == HI6_LWU)
1670     alpha_rt = ALPHA_T0;
1671     /* ldl rt,0(t3) */
1672     *a++ = 0x00; *a++ = 0x00; *a++ = 0x04 | ((alpha_rt & 7) << 5);
1673     *a++ = 0xa0 | ((alpha_rt >> 3) & 3);
1674     if (bigendian) {
1675     *a++ = 0x62; *a++ = 0x71; *a++ = 0x20; *a++ = 0x48; /* insbl t0,3,t1 */
1676     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1677     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48; /* sll t2,16,t2 */
1678     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1679     *a++ = 0xc3; *a++ = 0x50; *a++ = 0x20; *a++ = 0x48; /* extbl t0,2,t2 */
1680     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48; /* sll t2,8,t2 */
1681     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1682     *a++ = 0xc3; *a++ = 0x70; *a++ = 0x20; *a++ = 0x48; /* extbl t0,3,t2 */
1683     *a++ = 0x01; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t0 */
1684     *a++ = 0x01; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,zero,t0 (sign extend) 32->64 */
1685     }
1686     if (instruction_type == HI6_LWU) {
1687     /* Use only lowest 32 bits: */
1688     *a++ = 0x21; *a++ = 0xf6; *a++ = 0x21; *a++ = 0x48; /* zapnot t0,0xf,t0 */
1689     }
1690     if (alpha_rt == ALPHA_T0)
1691     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rt);
1692     break;
1693     case HI6_LHU:
1694     case HI6_LH:
1695     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0x30; /* ldwu from memory */
1696     if (bigendian) {
1697     *a++ = 0x62; *a++ = 0x31; *a++ = 0x20; *a++ = 0x48; /* insbl t0,1,t1 */
1698     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1699     *a++ = 0x01; *a++ = 0x04; *a++ = 0x43; *a++ = 0x44; /* or t1,t2,t0 */
1700     }
1701     if (instruction_type == HI6_LH) {
1702     *a++ = 0x21; *a++ = 0x00; *a++ = 0xe1; *a++ = 0x73; /* sextw t0,t0 */
1703     }
1704     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rt);
1705     break;
1706     case HI6_LBU:
1707     case HI6_LB:
1708     if (alpha_rt < 0)
1709     alpha_rt = ALPHA_T0;
1710     /* ldbu rt,0(t3) */
1711     *a++ = 0x00; *a++ = 0x00; *a++ = 0x04 | ((alpha_rt & 7) << 5);
1712     *a++ = 0x28 | ((alpha_rt >> 3) & 3);
1713     if (instruction_type == HI6_LB) {
1714     /* sextb rt,rt */
1715     *a++ = alpha_rt; *a++ = 0x00; *a++ = 0xe0 + alpha_rt; *a++ = 0x73;
1716     }
1717     if (alpha_rt == ALPHA_T0)
1718     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rt);
1719     break;
1720    
1721     case HI6_LWL:
1722     /* a1 = 0..3 (or 0..7 for 64-bit loads): */
1723     alpha_rs = map_MIPS_to_Alpha[rs];
1724     if (alpha_rs < 0) {
1725     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
1726     alpha_rs = ALPHA_T0;
1727     }
1728     *a++ = imm; *a++ = (imm >> 8); *a++ = 0x20 + alpha_rs; *a++ = 0x22;
1729     /* 02 30 20 46 and a1,alignment,t1 */
1730     *a++ = 0x02; *a++ = 0x10 + alignment * 0x20; *a++ = 0x20 + (alignment >> 3); *a++ = 0x46;
1731    
1732     /* ldl t0,0(t3) */
1733     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xa0;
1734    
1735     if (bigendian) {
1736     /* TODO */
1737     bintrans_write_chunkreturn_fail(&a);
1738     }
1739     /*
1740     * lwl: memory = 0x12 0x34 0x56 0x78
1741     * offset (a1): register rt becomes:
1742     * 0 0x12......
1743     * 1 0x3412....
1744     * 2 0x563412..
1745     * 3 0x78563412
1746     */
1747    
1748     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T2);
1749    
1750     /*
1751     10: 03 00 9f 20 lda t3,3
1752     14: a5 05 82 40 cmpeq t3,t1,t4
1753     18: 01 00 a0 e4 beq t4,20 <skip>
1754     */
1755     *a++ = 0x03; *a++ = 0x00; *a++ = 0x9f; *a++ = 0x20;
1756     *a++ = 0xa5; *a++ = 0x05; *a++ = 0x82; *a++ = 0x40;
1757     *a++ = 0x02; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
1758    
1759     /* 03 14 20 40 addq t0,0,t2 */
1760     *a++ = 0x03; *a++ = 0x14; *a++ = 0x20; *a++ = 0x40;
1761    
1762     ok_unaligned_load3 = a;
1763     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1764    
1765    
1766    
1767     *a++ = 0x02; *a++ = 0x00; *a++ = 0x9f; *a++ = 0x20;
1768     *a++ = 0xa5; *a++ = 0x05; *a++ = 0x82; *a++ = 0x40;
1769     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
1770     /*
1771     * 2 0x563412..
1772     2c: 21 17 21 48 sll t0,0x8,t0
1773     30: 01 10 20 40 addl t0,0,t0
1774     34: 03 f0 7f 44 and t2,0xff,t2
1775     38: 03 04 23 44 or t0,t2,t2
1776     */
1777     *a++ = 0x21; *a++ = 0x17; *a++ = 0x21; *a++ = 0x48;
1778     *a++ = 0x01; *a++ = 0x10; *a++ = 0x20; *a++ = 0x40;
1779     *a++ = 0x03; *a++ = 0xf0; *a++ = 0x7f; *a++ = 0x44;
1780     *a++ = 0x03; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
1781    
1782     ok_unaligned_load2 = a;
1783     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1784    
1785    
1786    
1787     *a++ = 0x01; *a++ = 0x00; *a++ = 0x9f; *a++ = 0x20;
1788     *a++ = 0xa5; *a++ = 0x05; *a++ = 0x82; *a++ = 0x40;
1789     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
1790     /*
1791     * 1 0x3412....
1792     2c: 21 17 22 48 sll t0,0x10,t0
1793     30: 01 10 20 40 addl t0,0,t0
1794     34: 23 76 60 48 zapnot t2,0x3,t2
1795     38: 03 04 23 44 or t0,t2,t2
1796     */
1797     *a++ = 0x21; *a++ = 0x17; *a++ = 0x22; *a++ = 0x48;
1798     *a++ = 0x01; *a++ = 0x10; *a++ = 0x20; *a++ = 0x40;
1799     *a++ = 0x23; *a++ = 0x76; *a++ = 0x60; *a++ = 0x48;
1800     *a++ = 0x03; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
1801    
1802     ok_unaligned_load1 = a;
1803     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1804    
1805    
1806    
1807    
1808     /*
1809     * 0 0x12......
1810     2c: 21 17 23 48 sll t0,0x18,t0
1811     30: 01 10 20 40 addl t0,0,t0
1812     34: 23 f6 60 48 zapnot t2,0x7,t2
1813     38: 03 04 23 44 or t0,t2,t2
1814     */
1815     *a++ = 0x21; *a++ = 0x17; *a++ = 0x23; *a++ = 0x48;
1816     *a++ = 0x01; *a++ = 0x10; *a++ = 0x20; *a++ = 0x40;
1817     *a++ = 0x23; *a++ = 0xf6; *a++ = 0x60; *a++ = 0x48;
1818     *a++ = 0x03; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
1819    
1820    
1821     *ok_unaligned_load3 = ((size_t)a - (size_t)ok_unaligned_load3 - 4) / 4;
1822     *ok_unaligned_load2 = ((size_t)a - (size_t)ok_unaligned_load2 - 4) / 4;
1823     *ok_unaligned_load1 = ((size_t)a - (size_t)ok_unaligned_load1 - 4) / 4;
1824    
1825     /* 03 10 60 40 addl t2,0,t2 */
1826     *a++ = 0x03; *a++ = 0x10; *a++ = 0x60; *a++ = 0x40;
1827    
1828     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T2, rt);
1829     break;
1830    
1831     case HI6_LWR:
1832     /* a1 = 0..3 (or 0..7 for 64-bit loads): */
1833     alpha_rs = map_MIPS_to_Alpha[rs];
1834     if (alpha_rs < 0) {
1835     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
1836     alpha_rs = ALPHA_T0;
1837     }
1838     *a++ = imm; *a++ = (imm >> 8); *a++ = 0x20 + alpha_rs; *a++ = 0x22;
1839     /* 02 30 20 46 and a1,alignment,t1 */
1840     *a++ = 0x02; *a++ = 0x10 + alignment * 0x20; *a++ = 0x20 + (alignment >> 3); *a++ = 0x46;
1841    
1842     /* ldl t0,0(t3) */
1843     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xa0;
1844    
1845     if (bigendian) {
1846     /* TODO */
1847     bintrans_write_chunkreturn_fail(&a);
1848     }
1849     /*
1850     * lwr: memory = 0x12 0x34 0x56 0x78
1851     * offset (a1): register rt becomes:
1852     * 0 0x78563412
1853     * 1 0x..785634
1854     * 2 0x....7856
1855     * 3 0x......78
1856     */
1857    
1858     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T2);
1859    
1860     /*
1861     10: 03 00 9f 20 lda t3,3
1862     14: a5 05 82 40 cmpeq t3,t1,t4
1863     18: 01 00 a0 e4 beq t4,20 <skip>
1864     */
1865     *a++ = 0x03; *a++ = 0x00; *a++ = 0x9f; *a++ = 0x20;
1866     *a++ = 0xa5; *a++ = 0x05; *a++ = 0x82; *a++ = 0x40;
1867     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
1868    
1869     /*
1870     2c: 81 16 23 48 srl t0,0x18,t0
1871     b0: 21 36 20 48 zapnot t0,0x1,t0
1872     34: 23 d6 7f 48 zapnot t2,0xfe,t2
1873     38: 03 04 23 44 or t0,t2,t2
1874     */
1875     *a++ = 0x81; *a++ = 0x16; *a++ = 0x23; *a++ = 0x48;
1876     *a++ = 0x21; *a++ = 0x36; *a++ = 0x20; *a++ = 0x48;
1877     *a++ = 0x23; *a++ = 0xd6; *a++ = 0x7f; *a++ = 0x48;
1878     *a++ = 0x03; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
1879    
1880     ok_unaligned_load3 = a;
1881     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1882    
1883    
1884    
1885     *a++ = 0x02; *a++ = 0x00; *a++ = 0x9f; *a++ = 0x20;
1886     *a++ = 0xa5; *a++ = 0x05; *a++ = 0x82; *a++ = 0x40;
1887     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
1888     /*
1889     2c: 81 16 22 48 srl t0,0x10,t0
1890     b4: 21 76 20 48 zapnot t0,0x3,t0
1891     34: 23 96 7f 48 zapnot t2,0xfc,t2
1892     38: 03 04 23 44 or t0,t2,t2
1893     */
1894     *a++ = 0x81; *a++ = 0x16; *a++ = 0x22; *a++ = 0x48;
1895     *a++ = 0x21; *a++ = 0x76; *a++ = 0x20; *a++ = 0x48;
1896     *a++ = 0x23; *a++ = 0x96; *a++ = 0x7f; *a++ = 0x48;
1897     *a++ = 0x03; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
1898    
1899     ok_unaligned_load2 = a;
1900     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1901    
1902    
1903    
1904     *a++ = 0x01; *a++ = 0x00; *a++ = 0x9f; *a++ = 0x20;
1905     *a++ = 0xa5; *a++ = 0x05; *a++ = 0x82; *a++ = 0x40;
1906     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
1907     /*
1908     2c: 81 16 21 48 srl t0,0x8,t0
1909     b8: 21 f6 20 48 zapnot t0,0x7,t0
1910     3c: 23 16 7f 48 zapnot t2,0xf8,t2
1911     40: 03 04 23 44 or t0,t2,t2
1912     */
1913     *a++ = 0x81; *a++ = 0x16; *a++ = 0x21; *a++ = 0x48;
1914     *a++ = 0x21; *a++ = 0xf6; *a++ = 0x20; *a++ = 0x48;
1915     *a++ = 0x23; *a++ = 0x16; *a++ = 0x7f; *a++ = 0x48;
1916     *a++ = 0x03; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
1917    
1918     ok_unaligned_load1 = a;
1919     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1920    
1921    
1922    
1923    
1924     /*
1925     * 0 0x12......
1926     */
1927     /* 03 14 20 40 addq t0,0,t2 */
1928     *a++ = 0x03; *a++ = 0x14; *a++ = 0x20; *a++ = 0x40;
1929    
1930    
1931    
1932     *ok_unaligned_load3 = ((size_t)a - (size_t)ok_unaligned_load3 - 4) / 4;
1933     *ok_unaligned_load2 = ((size_t)a - (size_t)ok_unaligned_load2 - 4) / 4;
1934     *ok_unaligned_load1 = ((size_t)a - (size_t)ok_unaligned_load1 - 4) / 4;
1935    
1936     /* 03 10 60 40 addl t2,0,t2 */
1937     *a++ = 0x03; *a++ = 0x10; *a++ = 0x60; *a++ = 0x40;
1938    
1939     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T2, rt);
1940     break;
1941    
1942     case HI6_SQ:
1943     /* TODO */
1944     break;
1945     case HI6_SD:
1946     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T0);
1947     if (bigendian) {
1948     /* remember original 8 bytes of t0: */
1949     *a++ = 0x05; *a++ = 0x04; *a++ = 0x3f; *a++ = 0x40; /* addq t0,zero,t4 */
1950    
1951     /* swap lowest 4 bytes: */
1952     *a++ = 0x62; *a++ = 0x71; *a++ = 0x20; *a++ = 0x48; /* insbl t0,3,t1 */
1953     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1954     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48; /* sll t2,16,t2 */
1955     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1956     *a++ = 0xc3; *a++ = 0x50; *a++ = 0x20; *a++ = 0x48; /* extbl t0,2,t2 */
1957     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48; /* sll t2,8,t2 */
1958     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1959     *a++ = 0xc3; *a++ = 0x70; *a++ = 0x20; *a++ = 0x48; /* extbl t0,3,t2 */
1960     *a++ = 0x01; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t0 */
1961    
1962     /* save result in (top 4 bytes of) t1, then t4. get back top bits of t4: */
1963     *a++ = 0x22; *a++ = 0x17; *a++ = 0x24; *a++ = 0x48; /* sll t0,0x20,t1 */
1964     *a++ = 0x81; *a++ = 0x16; *a++ = 0xa4; *a++ = 0x48; /* srl t4,0x20,t0 */
1965     *a++ = 0x05; *a++ = 0x14; *a++ = 0x40; *a++ = 0x40; /* addq t1,0,t4 */
1966    
1967     /* swap highest 4 bytes: */
1968     *a++ = 0x62; *a++ = 0x71; *a++ = 0x20; *a++ = 0x48; /* insbl t0,3,t1 */
1969     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1970     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48; /* sll t2,16,t2 */
1971     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1972     *a++ = 0xc3; *a++ = 0x50; *a++ = 0x20; *a++ = 0x48; /* extbl t0,2,t2 */
1973     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48; /* sll t2,8,t2 */
1974     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1975     *a++ = 0xc3; *a++ = 0x70; *a++ = 0x20; *a++ = 0x48; /* extbl t0,3,t2 */
1976     *a++ = 0x01; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t0 */
1977    
1978     /* or the results together: */
1979     *a++ = 0x01; *a++ = 0x04; *a++ = 0xa1; *a++ = 0x44; /* or t4,t0,t0 */
1980     }
1981     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xb4; /* stq to memory */
1982     break;
1983     case HI6_SW:
1984     if (alpha_rt < 0 || bigendian) {
1985     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T0);
1986     alpha_rt = ALPHA_T0;
1987     }
1988     if (bigendian) {
1989     *a++ = 0x62; *a++ = 0x71; *a++ = 0x20; *a++ = 0x48; /* insbl t0,3,t1 */
1990     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1991     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48; /* sll t2,16,t2 */
1992     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1993     *a++ = 0xc3; *a++ = 0x50; *a++ = 0x20; *a++ = 0x48; /* extbl t0,2,t2 */
1994     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48; /* sll t2,8,t2 */
1995     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1996     *a++ = 0xc3; *a++ = 0x70; *a++ = 0x20; *a++ = 0x48; /* extbl t0,3,t2 */
1997     *a++ = 0x01; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t0 */
1998     }
1999     /* stl to memory: stl rt,0(t3) */
2000     *a++ = 0x00; *a++ = 0x00; *a++ = 0x04 | ((alpha_rt & 7) << 5);
2001     *a++ = 0xb0 | ((alpha_rt >> 3) & 3);
2002     break;
2003     case HI6_SH:
2004     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T0);
2005     if (bigendian) {
2006     *a++ = 0x62; *a++ = 0x31; *a++ = 0x20; *a++ = 0x48; /* insbl t0,1,t1 */
2007     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
2008     *a++ = 0x01; *a++ = 0x04; *a++ = 0x43; *a++ = 0x44; /* or t1,t2,t0 */
2009     }
2010     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0x34; /* stw to memory */
2011     break;
2012     case HI6_SB:
2013     if (alpha_rt < 0) {
2014     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T0);
2015     alpha_rt = ALPHA_T0;
2016     }
2017     /* stb to memory: stb rt,0(t3) */
2018     *a++ = 0x00; *a++ = 0x00; *a++ = 0x04 | ((alpha_rt & 7) << 5);
2019     *a++ = 0x38 | ((alpha_rt >> 3) & 3);
2020     break;
2021    
2022     case HI6_SWL:
2023     /* a1 = 0..3 (or 0..7 for 64-bit stores): */
2024     alpha_rs = map_MIPS_to_Alpha[rs];
2025     if (alpha_rs < 0) {
2026     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
2027     alpha_rs = ALPHA_T0;
2028     }
2029     *a++ = imm; *a++ = (imm >> 8); *a++ = 0x20 + alpha_rs; *a++ = 0x22;
2030     /* 02 30 20 46 and a1,alignment,t1 */
2031     *a++ = 0x02; *a++ = 0x10 + alignment * 0x20; *a++ = 0x20 + (alignment >> 3); *a++ = 0x46;
2032    
2033     /* ldl t0,0(t3) */
2034     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xa0;
2035    
2036     if (bigendian) {
2037     /* TODO */
2038     bintrans_write_chunkreturn_fail(&a);
2039     }
2040    
2041     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T2);
2042    
2043     /*
2044     * swl: memory = 0x12 0x34 0x56 0x78
2045     * register = 0x89abcdef
2046     * offset (a1): memory becomes:
2047     * 0 0x89 0x.. 0x.. 0x..
2048     * 1 0xab 0x89 0x.. 0x..
2049     * 2 0xcd 0xab 0x89 0x..
2050     * 3 0xef 0xcd 0xab 0x89
2051     */
2052    
2053     /*
2054     a5 75 40 40 cmpeq t1,0x03,t4
2055     01 00 a0 e4 beq t4,20 <skip>
2056     */
2057     *a++ = 0xa5; *a++ = 0x75; *a++ = 0x40; *a++ = 0x40;
2058     *a++ = 0x02; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
2059    
2060     /* 01 10 60 40 addl t2,0,t0 */
2061     *a++ = 0x01; *a++ = 0x10; *a++ = 0x60; *a++ = 0x40;
2062    
2063     ok_unaligned_load3 = a;
2064     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
2065    
2066    
2067    
2068    
2069     *a++ = 0xa5; *a++ = 0x55; *a++ = 0x40; *a++ = 0x40;
2070     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
2071     /*
2072     2:
2073     e8: 83 16 61 48 srl t2,0x8,t2
2074     ec: 23 f6 60 48 zapnot t2,0x7,t2
2075     f0: 21 16 3f 48 zapnot t0,0xf8,t0
2076     f4: 01 04 23 44 or t0,t2,t0
2077     */
2078     *a++ = 0x83; *a++ = 0x16; *a++ = 0x61; *a++ = 0x48;
2079     *a++ = 0x23; *a++ = 0xf6; *a++ = 0x60; *a++ = 0x48;
2080     *a++ = 0x21; *a++ = 0x16; *a++ = 0x3f; *a++ = 0x48;
2081     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
2082    
2083     ok_unaligned_load2 = a;
2084     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
2085    
2086    
2087    
2088     *a++ = 0xa5; *a++ = 0x35; *a++ = 0x40; *a++ = 0x40;
2089     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
2090     /*
2091     1:
2092     f8: 83 16 62 48 srl t2,0x10,t2
2093     fc: 23 76 60 48 zapnot t2,0x3,t2
2094     100: 21 96 3f 48 zapnot t0,0xfc,t0
2095     104: 01 04 23 44 or t0,t2,t0
2096     */
2097     *a++ = 0x83; *a++ = 0x16; *a++ = 0x62; *a++ = 0x48;
2098     *a++ = 0x23; *a++ = 0x76; *a++ = 0x60; *a++ = 0x48;
2099     *a++ = 0x21; *a++ = 0x96; *a++ = 0x3f; *a++ = 0x48;
2100     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
2101    
2102     ok_unaligned_load1 = a;
2103     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
2104    
2105    
2106    
2107    
2108    
2109     /*
2110     0:
2111     108: 83 16 63 48 srl t2,0x18,t2
2112     10c: 23 36 60 48 zapnot t2,0x1,t2
2113     110: 21 d6 3f 48 zapnot t0,0xfe,t0
2114     114: 01 04 23 44 or t0,t2,t0
2115     */
2116     *a++ = 0x83; *a++ = 0x16; *a++ = 0x63; *a++ = 0x48;
2117     *a++ = 0x23; *a++ = 0x36; *a++ = 0x60; *a++ = 0x48;
2118     *a++ = 0x21; *a++ = 0xd6; *a++ = 0x3f; *a++ = 0x48;
2119     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
2120    
2121    
2122     *ok_unaligned_load3 = ((size_t)a - (size_t)ok_unaligned_load3 - 4) / 4;
2123     *ok_unaligned_load2 = ((size_t)a - (size_t)ok_unaligned_load2 - 4) / 4;
2124     *ok_unaligned_load1 = ((size_t)a - (size_t)ok_unaligned_load1 - 4) / 4;
2125    
2126     /* sdl t0,0(t3) */
2127     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xb0;
2128     break;
2129    
2130     case HI6_SWR:
2131     /* a1 = 0..3 (or 0..7 for 64-bit stores): */
2132     alpha_rs = map_MIPS_to_Alpha[rs];
2133     if (alpha_rs < 0) {
2134     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
2135     alpha_rs = ALPHA_T0;
2136     }
2137     *a++ = imm; *a++ = (imm >> 8); *a++ = 0x20 + alpha_rs; *a++ = 0x22;
2138     /* 02 30 20 46 and a1,alignment,t1 */
2139     *a++ = 0x02; *a++ = 0x10 + alignment * 0x20; *a++ = 0x20 + (alignment >> 3); *a++ = 0x46;
2140    
2141     /* ldl t0,0(t3) */
2142     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xa0;
2143    
2144     if (bigendian) {
2145     /* TODO */
2146     bintrans_write_chunkreturn_fail(&a);
2147     }
2148    
2149     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T2);
2150    
2151     /*
2152     * swr: memory = 0x12 0x34 0x56 0x78
2153     * register = 0x89abcdef
2154     * offset (a1): memory becomes:
2155     * 0 0xef 0xcd 0xab 0x89
2156     * 1 0x.. 0xef 0xcd 0xab
2157     * 2 0x.. 0x.. 0xef 0xcd
2158     * 3 0x.. 0x.. 0x.. 0xef
2159     */
2160    
2161    
2162     /*
2163     a5 75 40 40 cmpeq t1,0x03,t4
2164     01 00 a0 e4 beq t4,20 <skip>
2165     */
2166     *a++ = 0xa5; *a++ = 0x75; *a++ = 0x40; *a++ = 0x40;
2167     *a++ = 0x04; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
2168    
2169     /*
2170     118: 23 17 63 48 sll t2,0x18,t2
2171     11c: 21 f6 20 48 zapnot t0,0x7,t0
2172     120: 01 04 23 44 or t0,t2,t0
2173     */
2174     *a++ = 0x23; *a++ = 0x17; *a++ = 0x63; *a++ = 0x48;
2175     *a++ = 0x21; *a++ = 0xf6; *a++ = 0x20; *a++ = 0x48;
2176     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
2177    
2178     ok_unaligned_load3 = a;
2179     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
2180    
2181    
2182    
2183    
2184    
2185     *a++ = 0xa5; *a++ = 0x55; *a++ = 0x40; *a++ = 0x40;
2186     *a++ = 0x04; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
2187     /*
2188     2:
2189     124: 23 17 62 48 sll t2,0x10,t2
2190     128: 21 76 20 48 zapnot t0,0x3,t0
2191     12c: 01 04 23 44 or t0,t2,t0
2192     */
2193     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48;
2194     *a++ = 0x21; *a++ = 0x76; *a++ = 0x20; *a++ = 0x48;
2195     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
2196    
2197     ok_unaligned_load2 = a;
2198     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
2199    
2200    
2201    
2202     *a++ = 0xa5; *a++ = 0x35; *a++ = 0x40; *a++ = 0x40;
2203     *a++ = 0x04; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
2204     /*
2205     1:
2206     130: 23 17 61 48 sll t2,0x8,t2
2207     134: 21 36 20 48 zapnot t0,0x1,t0
2208     138: 01 04 23 44 or t0,t2,t0
2209     */
2210     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48;
2211     *a++ = 0x21; *a++ = 0x36; *a++ = 0x20; *a++ = 0x48;
2212     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
2213    
2214     ok_unaligned_load1 = a;
2215     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
2216    
2217    
2218    
2219     /*
2220     0:
2221     13c: 01 10 60 40 addl t2,0,t0
2222     */
2223     *a++ = 0x01; *a++ = 0x10; *a++ = 0x60; *a++ = 0x40;
2224    
2225    
2226     *ok_unaligned_load3 = ((size_t)a - (size_t)ok_unaligned_load3 - 4) / 4;
2227     *ok_unaligned_load2 = ((size_t)a - (size_t)ok_unaligned_load2 - 4) / 4;
2228     *ok_unaligned_load1 = ((size_t)a - (size_t)ok_unaligned_load1 - 4) / 4;
2229    
2230     /* sdl t0,0(t3) */
2231     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xb0;
2232     break;
2233    
2234     default:
2235     ;
2236     }
2237    
2238     *addrp = a;
2239     bintrans_write_pc_inc(addrp);
2240     return 1;
2241     }
2242    
2243    
2244     /*
2245     * bintrans_write_instruction__lui():
2246     */
2247     static int bintrans_write_instruction__lui(unsigned char **addrp,
2248     int rt, int imm)
2249     {
2250     uint32_t *a;
2251    
2252     /*
2253     * dc fe 3f 24 ldah t0,-292
2254     * 1f 04 ff 5f fnop
2255     * 88 08 30 b4 stq t0,2184(a0)
2256     */
2257     if (rt != 0) {
2258     int alpha_rt = map_MIPS_to_Alpha[rt];
2259     if (alpha_rt < 0)
2260     alpha_rt = ALPHA_T0;
2261    
2262     a = (uint32_t *) *addrp;
2263     *a++ = 0x241f0000 | (alpha_rt << 21) | ((uint32_t)imm & 0xffff);
2264     *addrp = (unsigned char *) a;
2265    
2266     if (alpha_rt == ALPHA_T0) {
2267     *a++ = 0x5fff041f; /* fnop */
2268     bintrans_move_Alpha_reg_into_MIPS_reg(addrp, ALPHA_T0, rt);
2269     }
2270     }
2271    
2272     bintrans_write_pc_inc(addrp);
2273    
2274     return 1;
2275     }
2276    
2277    
2278     /*
2279     * bintrans_write_instruction__mfmthilo():
2280     */
2281     static int bintrans_write_instruction__mfmthilo(unsigned char **addrp,
2282     int rd, int from_flag, int hi_flag)
2283     {
2284     unsigned char *a;
2285     int ofs;
2286    
2287     a = *addrp;
2288    
2289     /*
2290     * 18 09 30 a4 ldq t0,hi(a0) (or lo)
2291     * 18 09 30 b4 stq t0,rd(a0)
2292     *
2293     * (or if from_flag is cleared then move the other way, it's
2294     * actually not rd then, but rs...)
2295     */
2296    
2297     if (from_flag) {
2298     if (rd != 0) {
2299     /* mfhi or mflo */
2300     if (hi_flag)
2301     ofs = ((size_t)&dummy_cpu.cd.mips.hi) - (size_t)&dummy_cpu;
2302     else
2303     ofs = ((size_t)&dummy_cpu.cd.mips.lo) - (size_t)&dummy_cpu;
2304     *a++ = (ofs & 255); *a++ = (ofs >> 8); *a++ = 0x30; *a++ = 0xa4;
2305    
2306     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rd);
2307     }
2308     } else {
2309     /* mthi or mtlo */
2310     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rd, ALPHA_T0);
2311    
2312     if (hi_flag)
2313     ofs = ((size_t)&dummy_cpu.cd.mips.hi) - (size_t)&dummy_cpu;
2314     else
2315     ofs = ((size_t)&dummy_cpu.cd.mips.lo) - (size_t)&dummy_cpu;
2316     *a++ = (ofs & 255); *a++ = (ofs >> 8); *a++ = 0x30; *a++ = 0xb4;
2317     }
2318    
2319     *addrp = a;
2320     bintrans_write_pc_inc(addrp);
2321     return 1;
2322     }
2323    
2324    
2325     /*
2326     * bintrans_write_instruction__mfc_mtc():
2327     */
2328     static int bintrans_write_instruction__mfc_mtc(struct memory *mem,
2329     unsigned char **addrp, int coproc_nr, int flag64bit, int rt,
2330     int rd, int mtcflag)
2331     {
2332     uint32_t *a, *jump;
2333     int ofs;
2334    
2335     /*
2336     * NOTE: Only a few registers are readable without side effects.
2337     */
2338     if (rt == 0 && !mtcflag)
2339     return 0;
2340    
2341     if (coproc_nr >= 1)
2342     return 0;
2343    
2344     if (rd == COP0_RANDOM || rd == COP0_COUNT)
2345     return 0;
2346    
2347    
2348     /*************************************************************
2349     *
2350     * TODO: Check for kernel mode, or Coproc X usability bit!
2351     *
2352     *************************************************************/
2353    
2354     a = (uint32_t *) *addrp;
2355    
2356     ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
2357     *a++ = 0xa4300000 | (ofs & 0xffff); /* ldq t0,coproc[0](a0) */
2358    
2359     ofs = ((size_t)&dummy_coproc.reg[rd]) - (size_t)&dummy_coproc;
2360     *a++ = 0xa4410000 | (ofs & 0xffff); /* ldq t1,reg_rd(t0) */
2361    
2362     if (mtcflag) {
2363     /* mtc: */
2364     *addrp = (unsigned char *) a;
2365     bintrans_move_MIPS_reg_into_Alpha_reg(addrp, rt, ALPHA_T0);
2366     a = (uint32_t *) *addrp;
2367    
2368     if (!flag64bit) {
2369     *a++ = 0x40201001; /* addl t0,0,t0 */
2370     *a++ = 0x40401002; /* addl t1,0,t1 */
2371     }
2372    
2373     /*
2374     * In the general case: Only allow mtc if it does NOT
2375     * change the register!!
2376     */
2377    
2378     switch (rd) {
2379     case COP0_INDEX:
2380     break;
2381    
2382     case COP0_EPC:
2383     break;
2384    
2385     /* TODO: Some bits are not writable */
2386     case COP0_ENTRYLO0:
2387     case COP0_ENTRYLO1:
2388     break;
2389    
2390     case COP0_ENTRYHI:
2391     /*
2392     * Entryhi is ok to write to, as long as the
2393     * ASID isn't changed. (That would require
2394     * cache invalidations etc. Instead of checking
2395     * for MMU3K vs others, we just assume that all the
2396     * lowest 12 bits must be the same.
2397     */
2398     /* ff 0f bf 20 lda t4,0x0fff */
2399     /* 03 00 25 44 and t0,t4,t2 */
2400     /* 04 00 45 44 and t1,t4,t3 */
2401     /* a3 05 64 40 cmpeq t2,t3,t2 */
2402     /* 01 00 60 f4 bne t2,<ok> */
2403     *a++ = 0x20bf0fff;
2404     *a++ = 0x44250003;
2405     *a++ = 0x44450004;
2406     *a++ = 0x406405a3;
2407     jump = a;
2408     *a++ = 0; /* later */
2409     *addrp = (unsigned char *) a;
2410     bintrans_write_chunkreturn_fail(addrp);
2411     a = (uint32_t *) *addrp;
2412     *jump = 0xf4600000 | (((size_t)a - (size_t)jump - 4) / 4);
2413     break;
2414    
2415     case COP0_STATUS:
2416     /* Only allow updates to the status register if
2417     the interrupt enable bits were changed, but no
2418     other bits! */
2419     if (mem->bintrans_32bit_only) {
2420     /* R3000 etc. */
2421     /* t4 = 0x0fe70000; */
2422     *a++ = 0x20bf0000;
2423     *a++ = 0x24a50fe7;
2424     } else {
2425     /* fe 00 bf 20 lda t4,0x00fe */
2426     /* ff ff a5 24 ldah t4,-1(t4) */
2427     *a++ = 0x20bf0000;
2428     *a++ = 0x24a5ffff;
2429     }
2430    
2431     /* 03 00 25 44 and t0,t4,t2 */
2432     /* 04 00 45 44 and t1,t4,t3 */
2433     /* a3 05 64 40 cmpeq t2,t3,t2 */
2434     /* 01 00 60 f4 bne t2,<ok> */
2435     *a++ = 0x44250003;
2436     *a++ = 0x44450004;
2437     *a++ = 0x406405a3;
2438     jump = a;
2439     *a++ = 0; /* later */
2440     *addrp = (unsigned char *) a;
2441     bintrans_write_chunkreturn_fail(addrp);
2442     a = (uint32_t *) *addrp;
2443     *jump = 0xf4600000 | (((size_t)a - (size_t)jump - 4) / 4);
2444    
2445     /* If enabling interrupt bits would cause an
2446     exception, then don't do it: */
2447     ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
2448     *a++ = 0xa4900000 | (ofs & 0xffff); /* ldq t3,coproc[0](a0) */
2449     ofs = ((size_t)&dummy_coproc.reg[COP0_CAUSE]) - (size_t)&dummy_coproc;
2450     *a++ = 0xa4a40000 | (ofs & 0xffff); /* ldq t4,reg_rd(t3) */
2451    
2452     /* 02 00 a1 44 and t4,t0,t1 */
2453     /* 83 16 41 48 srl t1,0x8,t2 */
2454     /* 04 f0 7f 44 and t2,0xff,t3 */
2455     *a++ = 0x44a10002;
2456     *a++ = 0x48411683;
2457     *a++ = 0x447ff004;
2458     /* 01 00 80 e4 beq t3,<ok> */
2459     jump = a;
2460     *a++ = 0; /* later */
2461     *addrp = (unsigned char *) a;
2462     bintrans_write_chunkreturn_fail(addrp);
2463     a = (uint32_t *) *addrp;
2464     *jump = 0xe4800000 | (((size_t)a - (size_t)jump - 4) / 4);
2465     break;
2466    
2467     default:
2468     /* a3 05 22 40 cmpeq t0,t1,t2 */
2469     /* 01 00 60 f4 bne t2,<ok> */
2470     *a++ = 0x402205a3;
2471     jump = a;
2472     *a++ = 0; /* later */
2473     *addrp = (unsigned char *) a;
2474     bintrans_write_chunkreturn_fail(addrp);
2475     a = (uint32_t *) *addrp;
2476     *jump = 0xf4600000 | (((size_t)a - (size_t)jump - 4) / 4);
2477     }
2478    
2479     *a++ = 0x40201402; /* addq t0,0,t1 */
2480    
2481     ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
2482     *a++ = 0xa4300000 | (ofs & 0xffff); /* ldq t0,coproc[0](a0) */
2483     ofs = ((size_t)&dummy_coproc.reg[rd]) - (size_t)&dummy_coproc;
2484     *a++ = 0xb4410000 | (ofs & 0xffff); /* stq t1,reg_rd(t0) */
2485     } else {
2486     /* mfc: */
2487     if (!flag64bit) {
2488     *a++ = 0x40401002; /* addl t1,0,t1 */
2489     }
2490    
2491     *addrp = (unsigned char *) a;
2492     bintrans_move_Alpha_reg_into_MIPS_reg(addrp, ALPHA_T1, rt);
2493     a = (uint32_t *) *addrp;
2494     }
2495    
2496     *addrp = (unsigned char *) a;
2497    
2498     bintrans_write_pc_inc(addrp);
2499     return 1;
2500     }
2501    
2502    
2503     /*
2504     * bintrans_write_instruction__tlb_rfe_etc():
2505     */
2506     static int bintrans_write_instruction__tlb_rfe_etc(unsigned char **addrp,
2507     int itype)
2508     {
2509     uint32_t *a;
2510     int ofs = 0;
2511    
2512     switch (itype) {
2513     case CALL_TLBWI:
2514     case CALL_TLBWR:
2515     case CALL_TLBP:
2516     case CALL_TLBR:
2517     case CALL_RFE:
2518     case CALL_ERET:
2519     case CALL_BREAK:
2520     case CALL_SYSCALL:
2521     break;
2522     default:
2523     return 0;
2524     }
2525    
2526     a = (uint32_t *) *addrp;
2527    
2528     /* a0 = pointer to the cpu struct */
2529    
2530     switch (itype) {
2531     case CALL_TLBWI:
2532     case CALL_TLBWR:
2533     /* a1 = 0 for indexed, 1 for random */
2534     *a++ = 0x223f0000 | (itype == CALL_TLBWR);
2535     break;
2536     case CALL_TLBP:
2537     case CALL_TLBR:
2538     /* a1 = 0 for probe, 1 for read */
2539     *a++ = 0x223f0000 | (itype == CALL_TLBR);
2540     break;
2541     case CALL_BREAK:
2542     case CALL_SYSCALL:
2543     *a++ = 0x223f0000 | (itype == CALL_BREAK? EXCEPTION_BP : EXCEPTION_SYS);
2544     break;
2545     }
2546    
2547     /* Put PC into the cpu struct (both pc and pc_last). */
2548     *a++ = 0xb4d00000 | ofs_pc; /* stq t5,"pc"(a0) */
2549     *a++ = 0xb4d00000 | ofs_pc_last;/* stq t5,"pc_last"(a0) */
2550    
2551     /* Save a0 and the old return address on the stack: */
2552     *a++ = 0x23deff80; /* lda sp,-128(sp) */
2553    
2554     *a++ = 0xb75e0000; /* stq ra,0(sp) */
2555     *a++ = 0xb61e0008; /* stq a0,8(sp) */
2556     *a++ = 0xb0fe0018; /* stl t6,24(sp) */
2557     *a++ = 0xb71e0020; /* stq t10,32(sp) */
2558     *a++ = 0xb73e0028; /* stq t11,40(sp) */
2559     *a++ = 0xb51e0030; /* stq t7,48(sp) */
2560     *a++ = 0xb6de0038; /* stq t8,56(sp) */
2561     *a++ = 0xb6fe0040; /* stq t9,64(sp) */
2562    
2563     switch (itype) {
2564     case CALL_TLBP:
2565     case CALL_TLBR:
2566     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_tlbpr) - (size_t)&dummy_cpu;
2567     break;
2568     case CALL_TLBWR:
2569     case CALL_TLBWI:
2570     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_tlbwri) - (size_t)&dummy_cpu;
2571     break;
2572     case CALL_RFE:
2573     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_rfe) - (size_t)&dummy_cpu;
2574     break;
2575     case CALL_ERET:
2576     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_eret) - (size_t)&dummy_cpu;
2577     break;
2578     case CALL_BREAK:
2579     case CALL_SYSCALL:
2580     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_simple_exception) - (size_t)&dummy_cpu;
2581     break;
2582     }
2583    
2584     *a++ = 0xa7700000 | ofs; /* ldq t12,0(a0) */
2585    
2586     /* Call bintrans_fast_tlbwr: */
2587     *a++ = 0x6b5b4000; /* jsr ra,(t12),<after> */
2588    
2589     /* Restore the old return address and a0 from the stack: */
2590     *a++ = 0xa75e0000; /* ldq ra,0(sp) */
2591     *a++ = 0xa61e0008; /* ldq a0,8(sp) */
2592     *a++ = 0xa0fe0018; /* ldl t6,24(sp) */
2593     *a++ = 0xa71e0020; /* ldq t10,32(sp) */
2594     *a++ = 0xa73e0028; /* ldq t11,40(sp) */
2595     *a++ = 0xa51e0030; /* ldq t7,48(sp) */
2596     *a++ = 0xa6de0038; /* ldq t8,56(sp) */
2597     *a++ = 0xa6fe0040; /* ldq t9,64(sp) */
2598    
2599     *a++ = 0x23de0080; /* lda sp,128(sp) */
2600    
2601     /* Load PC from the cpu struct. */
2602     *a++ = 0xa4d00000 | ofs_pc; /* ldq t5,"pc"(a0) */
2603    
2604     *addrp = (unsigned char *) a;
2605    
2606     switch (itype) {
2607     case CALL_ERET:
2608     case CALL_BREAK:
2609     case CALL_SYSCALL:
2610     break;
2611     default:
2612     bintrans_write_pc_inc(addrp);
2613     }
2614    
2615     return 1;
2616     }
2617    
2618    
2619     /*
2620     * bintrans_backend_init():
2621     *
2622     * This is neccessary for broken 2.95.4 compilers on FreeBSD/Alpha 4.9,
2623     * and probably a few others. (For Compaq's CC, and for gcc 3.x, this
2624     * wouldn't be neccessary, and the old code would have worked.)
2625     */
2626     static void bintrans_backend_init(void)
2627     {
2628     int size;
2629 dpavlin 4 uint32_t *p, *q;
2630 dpavlin 2
2631    
2632     /* "runchunk": */
2633     size = 256; /* NOTE: This MUST be enough, or we fail */
2634     p = (uint32_t *)mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_EXEC,
2635     MAP_ANON | MAP_PRIVATE, -1, 0);
2636    
2637     /* If mmap() failed, try malloc(): */
2638     if (p == NULL) {
2639     p = malloc(size);
2640     if (p == NULL) {
2641     fprintf(stderr, "bintrans_backend_init(): out of memory\n");
2642     exit(1);
2643     }
2644     }
2645    
2646     bintrans_runchunk = (void *)p;
2647    
2648     *p++ = 0x23deffa0; /* lda sp,-0x60(sp) */
2649     *p++ = 0xb75e0000; /* stq ra,0(sp) */
2650     *p++ = 0xb53e0008; /* stq s0,8(sp) */
2651     *p++ = 0xb55e0010; /* stq s1,16(sp) */
2652     *p++ = 0xb57e0018; /* stq s2,24(sp) */
2653     *p++ = 0xb59e0020; /* stq s3,32(sp) */
2654     *p++ = 0xb5be0028; /* stq s4,40(sp) */
2655     *p++ = 0xb5de0030; /* stq s5,48(sp) */
2656     *p++ = 0xb5fe0038; /* stq s6,56(sp) */
2657     *p++ = 0xb7be0058; /* stq gp,0x58(sp) */
2658    
2659     *p++ = 0xa4d00000 | ofs_pc; /* ldq t5,"pc"(a0) */
2660     *p++ = 0xa0f00000 | ofs_n; /* ldl t6,"bintrans_instructions_executed"(a0) */
2661     *p++ = 0xa5100000 | ofs_a0; /* ldq t7,"a0"(a0) */
2662     *p++ = 0xa6d00000 | ofs_a1; /* ldq t8,"a1"(a0) */
2663     *p++ = 0xa6f00000 | ofs_s0; /* ldq t9,"s0"(a0) */
2664     *p++ = 0xa1300000 | ofs_ds; /* ldl s0,"delay_slot"(a0) */
2665     *p++ = 0xa5500000 | ofs_ja; /* ldq s1,"delay_jmpaddr"(a0) */
2666     *p++ = 0xa5700000 | ofs_sp; /* ldq s2,"gpr[sp]"(a0) */
2667     *p++ = 0xa5900000 | ofs_ra; /* ldq s3,"gpr[ra]"(a0) */
2668     *p++ = 0xa5b00000 | ofs_t0; /* ldq s4,"gpr[t0]"(a0) */
2669     *p++ = 0xa5d00000 | ofs_t1; /* ldq s5,"gpr[t1]"(a0) */
2670     *p++ = 0xa5f00000 | ofs_t2; /* ldq s6,"gpr[t2]"(a0) */
2671     *p++ = 0xa7100000 | ofs_tbl0; /* ldq t10,table0(a0) */
2672     *p++ = 0xa7300000 | ofs_v0; /* ldq t11,"gpr[v0]"(a0) */
2673    
2674     *p++ = 0x6b514000; /* jsr ra,(a1),<back> */
2675    
2676     *p++ = 0xb4d00000 | ofs_pc; /* stq t5,"pc"(a0) */
2677     *p++ = 0xb0f00000 | ofs_n; /* stl t6,"bintrans_instructions_executed"(a0) */
2678     *p++ = 0xb5100000 | ofs_a0; /* stq t7,"a0"(a0) */
2679     *p++ = 0xb6d00000 | ofs_a1; /* stq t8,"a1"(a0) */
2680     *p++ = 0xb6f00000 | ofs_s0; /* stq t9,"s0"(a0) */
2681     *p++ = 0xb1300000 | ofs_ds; /* stl s0,"delay_slot"(a0) */
2682     *p++ = 0xb5500000 | ofs_ja; /* stq s1,"delay_jmpaddr"(a0) */
2683     *p++ = 0xb5700000 | ofs_sp; /* stq s2,"gpr[sp]"(a0) */
2684     *p++ = 0xb5900000 | ofs_ra; /* stq s3,"gpr[ra]"(a0) */
2685     *p++ = 0xb5b00000 | ofs_t0; /* stq s4,"gpr[t0]"(a0) */
2686     *p++ = 0xb5d00000 | ofs_t1; /* stq s5,"gpr[t1]"(a0) */
2687     *p++ = 0xb5f00000 | ofs_t2; /* stq s6,"gpr[t2]"(a0) */
2688     *p++ = 0xb7300000 | ofs_v0; /* stq t11,"gpr[v0]"(a0) */
2689    
2690     *p++ = 0xa75e0000; /* ldq ra,0(sp) */
2691     *p++ = 0xa53e0008; /* ldq s0,8(sp) */
2692     *p++ = 0xa55e0010; /* ldq s1,16(sp) */
2693     *p++ = 0xa57e0018; /* ldq s2,24(sp) */
2694     *p++ = 0xa59e0020; /* ldq s3,32(sp) */
2695     *p++ = 0xa5be0028; /* ldq s4,40(sp) */
2696     *p++ = 0xa5de0030; /* ldq s5,48(sp) */
2697     *p++ = 0xa5fe0038; /* ldq s6,56(sp) */
2698     *p++ = 0xa7be0058; /* ldq gp,0x58(sp) */
2699     *p++ = 0x23de0060; /* lda sp,0x60(sp) */
2700     *p++ = 0x6bfa8001; /* ret */
2701    
2702    
2703     /* "jump to 32bit pc": */
2704     size = 128; /* WARNING! Don't make this too small. */
2705     p = (uint32_t *)mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_EXEC,
2706     MAP_ANON | MAP_PRIVATE, -1, 0);
2707    
2708     /* If mmap() failed, try malloc(): */
2709     if (p == NULL) {
2710     p = malloc(size);
2711     if (p == NULL) {
2712     fprintf(stderr, "bintrans_backend_init(): out of memory\n");
2713     exit(1);
2714     }
2715     }
2716    
2717     bintrans_jump_to_32bit_pc = (void *)p;
2718    
2719     /* Don't execute too many instructions: */
2720     *p++ = 0x205f0000 | (N_SAFE_BINTRANS_LIMIT-1); /* lda t1,safe-1 */
2721    
2722     *p++ = 0x40e20da1; /* cmple t6,t1,t0 */
2723 dpavlin 4 q = p; /* *q is updated later */
2724     *p++ = 0xe4200001; /* beq ret (far below) */
2725 dpavlin 2
2726     *p++ = 0x40c01411; /* addq t5,0,a1 */
2727    
2728     /*
2729     * Special case for 32-bit addressing:
2730     *
2731     * t1 = 1023;
2732     * t2 = ((a1 >> 22) & t1) * sizeof(void *);
2733     * t3 = ((a1 >> 12) & t1) * sizeof(void *);
2734     * t1 = a1 & 4095;
2735     */
2736     *p++ = 0x205f1ff8; /* lda t1,1023 * 8 */
2737     *p++ = 0x4a227683; /* srl a1,19,t2 */
2738     *p++ = 0x4a213684; /* srl a1, 9,t3 */
2739     *p++ = 0x44620003; /* and t2,t1,t2 */
2740    
2741     /*
2742     * t10 is vaddr_to_hostaddr_table0
2743     *
2744     * a3 = tbl0[t2] (load entry from tbl0)
2745     */
2746     *p++ = 0x43030412; /* addq t10,t2,a2 */
2747     *p++ = 0x44820004; /* and t3,t1,t3 */
2748     *p++ = 0xa6720000; /* ldq a3,0(a2) */
2749     *p++ = 0x205f0ffc; /* lda t1,0xffc */
2750    
2751     /*
2752 dpavlin 4 * a3 = tbl1[t3] (load entry from tbl1 (which is a3))
2753 dpavlin 2 */
2754     *p++ = 0x42640413; /* addq a3,t3,a3 */
2755     *p++ = 0x46220002; /* and a1,t1,t1 */
2756    
2757     *p++ = 0xa6730000 | ofs_c0; /* ldq a3,chunks[0](a3) */
2758    
2759     /*
2760     * NULL? Then just return.
2761     */
2762     *p++ = 0xf6600001; /* bne a3,<ok> */
2763     *p++ = 0x6bfa8001; /* ret */
2764    
2765     *p++ = 0x40530402; /* addq t1,a3,t1 */
2766     *p++ = 0xa0220000; /* ldl t0,0(t1) */
2767    
2768     /* No translation? Then return. */
2769     *p++ = 0xe4200003; /* beq t0,<skip> */
2770    
2771     *p++ = 0xa4700000 | ofs_cb; /* ldq t2,chunk_base_address(a0) */
2772    
2773     *p++ = 0x40230401; /* addq t0,t2,t0 */
2774     *p++ = 0x6be10000; /* jmp (t0) */
2775    
2776 dpavlin 4 /* Now, update *q to point here: */
2777     *q = 0xe4200000 | (((size_t)p - (size_t)q)/4 - 1); /* beq ret */
2778    
2779 dpavlin 2 /* Return to the main translation loop. */
2780     *p++ = 0x6bfa8001; /* ret */
2781     }
2782    

  ViewVC Help
Powered by ViewVC 1.1.26