Annotation of /trunk/src/bintrans_alpha.c

Revision 2
Mon Oct 8 16:17:48 2007 UTC (16 years, 6 months ago) by dpavlin
File MIME type: text/plain
File size: 83665 byte(s)
++ trunk/HISTORY	(local)
$Id: HISTORY,v 1.676 2005/04/07 15:14:55 debug Exp $

Changelog for GXemul:
---------------------

20030829	Skeleton. ELF stuff. Some instructions.
20030830	Simple test programs using +-*/^|&%, function calls,
		loops, and stuff like that work.
20030903	Adding more instructions, fixing some bugs.
20030907	Making adding of memory mapped devices easier, although
		the framework isn't built for speed.
		Adding a -q switch to turn off debug output.
20030911	Trying to fix some bugs. Minor changes. Some COP0
		registers are now meaningful.
20030919	Making APs (non-bootstrap cpus) available via a simple
		'mp' device. Implementing ll/lld and sc/scd (for atomic
		memory updates, needed by MP operating systems).
20030923	Minor updates: more instructions (divu, mulu, lwu,
		perhaps some more), and opcode usage statistics.
20030924	If the next instruction is nullified (for 'branch
		likely' type of instructions), counters for delays etc
		are now decreased as they should.
		Adding some comments.
		Adding instructions: movz, movn.
		Adding a simple mandelbrot test to mipstest.c.
20030925	Adding instructions: bltzl, bgezl, lh, lhu, sh, mfc*,
		mtc*.
		Adding dummy instructions: sync, cache.
		Adding minimal DECstation PROM functionality: printf()
		and getsysid() callback functions.
		Beginning work on address translation.
20030927	Adding some more cop0 functionality (tlb stuff).
		Adding mc146818 real-time clock. (Skeleton stuff.)
20030928	Adding a dc7085 serial console device (dummy, but enough
		to output chars to the screen). NetBSD uses this for
		the MIPSMATE 5100.
20030929	Working on the TLB stuff.
		Adding instructions: srlv, tlbwr, tlbr, tlbp, eret.
20030930	Trying to find a bug which causes NetBSD to bug out, but
		it is really hard.
		Adding some a.out support (for loading an old
		OpenBSD 2.8/pmax kernel image).
		Adding instructions: lwc*, ldc*, swc1 and swc3.
		Beginning to add special code to handle the differences
		between R4000 (the default emulation) and R2000/R3000.
20031001	Symbol listings produced by 'nm -S' can be used to
		show symbolic names for addresses. (-S)
20031002	Fixing the i/d fake cache for R2000/R3000. It's still
		just an ugly hack, though.
		Fixing minor bugs to make the 3100 emulation use the
		dc device (serial console) correctly. So far, 5100 and
		3100 are the only ones that get far enough to print
		stuff, when booting NetBSD.
20031004	Adding skeleton Cobalt machine emulation (-E).
		Adding a dummy ns16550 serial controller, used by the
		Cobalt machine emulation.
20031006	Adding unaligned load/store instructions (lwl, lwr,
		ldl, ldr, swl, swr, sdl, sdr), although they are not
		tested yet.
		Fixed a "data modified on freelist" bug when running
		NetBSD/cobalt: setting the top bit of the index register
		when a tlbp fails (as the R4000 manual says) isn't
		sufficient, I had to clear the low bits as well.
		Adding break and syscall instructions, but they are not
		tested yet.
		Adding a 'gt' device, faking a PCI bus, for the Cobalt
		emulation.
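The tlbp fix in the 20031006 entry can be illustrated with a minimal sketch. The function names and the INDEX_P macro are hypothetical and not taken from the emulator source; the sketch only assumes that the probe-failure flag is bit 31 of the cop0 index register, as on the R4000.

```c
#include <stdint.h>

#define INDEX_P	0x80000000u	/* probe-failure bit (bit 31), assumed name */

/* Behaviour before the fix: only the top bit is set, stale low bits remain. */
static uint32_t tlbp_fail_before(uint32_t old_index)
{
	return old_index | INDEX_P;
}

/* Behaviour after the fix: top bit set, all low bits cleared. */
static uint32_t tlbp_fail_after(uint32_t old_index)
{
	(void)old_index;	/* the previous index value is discarded */
	return INDEX_P;
}
```

With an old index of 0x2f, the first variant yields 0x8000002f and the second 0x80000000, which is the value the entry says NetBSD/cobalt needed.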
20031008	Adding initial support for HPCmips (-F), a framebuffer
		device using X11. NetBSD/hpcmips can output pixels to
		the framebuffer, but that's about it.
20031009	Fixing the NetBSD/pmax bug: the "0/tftp/netbsd" style
		bootstring was only passed correctly in the bootinfo
		block, it needs to be passed as argv[0] as well.
		Adding instructions: mtlo, mthi.
		Rearranging the source tree layout.
		Adding console input functionality. The NetBSD/cobalt
		kernel's ddb can now be interacted with.
20031010	Adding experimental (semi-useless) -t option, to show
		a function call tree while a program runs.
		Linux/cobalt now prints a few messages, but then hangs
		at "Calibrating delay loop..." unless an ugly hack is
		used (setting a word of memory at 0x801e472c to non-zero).
20031013	Adding a framebuffer device used in DECstation 3100;
		VFB01 for mono is implemented so far, not yet the
		VFB02 (color) variant.  Rewriting the framebuffer
		device so that it is usable by both HPCmips and DECstation
		emulation.
20031014	Minor fixes. Everything should compile and run ok
		both with and without X11.
20031015	Adding support for ECOFF binary images; text, data,
		and symbols are loaded. (Playing around with ultrixboot
		and ultrix kernels.)
20031016	The DECstation argv,argc stuff must be at 0xa0000000,
		not 0x80000000, or Ultrix kernels complain.
		Adding R2000/R3000 'rfe' instruction.
		Implementing more R2K/R3K tlb specific stuff, so that
		NetBSD boots and uses the tlb correctly, but much of
		it is ugly. (Needs to be separated in a cleaner way.)
		ECOFF symbols sizes are now calculated, so that offsets
		within symbols are usable.
20031017	DECstation bootstrings now automatically include the
		correct name of the kernel that is booting.
		Ultrix boots a bit.
20031018	ELF symbols are now read automatically from the binary.
		-t trace looks a bit better (string arguments are shown).
		Trying to get initial R5900 stuff working (the 128-bit
		CPU used in Playstation 2).
		Fixing a minor bug to make the VFB02 (color framebuffer)
		device work better, but it is still just 256 grayscales,
		not real color. Ultrix can now use the framebuffer (it
		calls it PMAX-CFB).
		A machine can now consist of CPUs of different types.
		Adding instructions: daddi, mov_xxx, mult_xx. The xxx
		instructions are not documented MIPS64 instructions,
		but NetBSD/playstation2 uses them. Perhaps VR5432
		instructions?
		Adding sign-extension to 32-bit mult.
		Adding Playstation 2 devices: dmac (DMA controller),
		gs (Graphic something?), and gif (graphics something
		else, which has access to the PS2's framebuffer).
		NetBSD/playstation2 works a bit, and prints a few
		bootup messages.
20031020	The cpu_type field of the cpu struct now contains
		usable values in a much better form than before. This
		simplifies adding of new CPU types.
20031021	Fixing an interrupt related bug: pc_last was used, but
		for interrupts this was incorrect. Fixed now.
		Fixing a load/store related bug: if a load into a
		register was aborted due to an exception, the register
		was still modified.
		The mc146818 rtc now reads its time from the system's
		time() function.
		Fixing another exception bug: if loading an instruction
		caused an exception, something bogus happened as the
		emulator tried to execute the instruction anyway. This
		has been fixed now.
20031023	Adding a quick hack which skips "while (reg --) ;"
		kind of loops.
		NetBSD/pmax suddenly reached userland (!), but only
		once and attempts to repeat it have failed. I believe
		it is problems with my interrupt handling system.
20031024	Adding 8-bit color palette support to the framebuffer.
		Connecting the pmax vdac device to the framebuffer's
		rgb palette.
		Fixing a bug in the dc device, so that console input
		is possible; interaction with NetBSD/pmax's built-in
		kernel debugger works now.
		Symbol sizes for file formats where symbol size isn't
		included are now calculated regardless of file format.
		Physical memory space can now be smaller than 64 bits,
		improving emulation speed a bit.
		Doing other minor performance enhancements by moving
		around some statements in critical parts of the code.
20031025	Minor changes to the dc device.
20031026	Adding support for reading symbols directly from
		a.out files. (Works with OpenBSD/pmax binaries.)
		Hardware devices may now register "tick functions" at
		specific cycle intervals in a generic fashion.
		All four channels of the dc serial controller device
		should now work; playing around with keyboard scan
		code generation when using the DECstation framebuffer.
		Making various (speed) improvements to the framebuffer
		device.
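The "tick functions" mentioned in the 20031026 entry could look roughly like the sketch below. It is an illustration under assumptions, not the emulator's actual interface: register_tick, run_ticks, MAX_TICK_FUNCTIONS and count_tick are all hypothetical names, and the fixed-size table is only a guess.

```c
#include <stdint.h>

#define MAX_TICK_FUNCTIONS	16	/* hypothetical table size */

typedef void (*tick_fn)(void *extra);

struct tick_entry {
	tick_fn fn;		/* device callback */
	void *extra;		/* device-private data passed to the callback */
	int64_t interval;	/* call every 'interval' cycles */
};

static struct tick_entry ticks[MAX_TICK_FUNCTIONS];
static int n_ticks;

/* Register a callback; returns 0 on success, -1 if the table is full. */
static int register_tick(tick_fn fn, void *extra, int64_t interval)
{
	if (n_ticks >= MAX_TICK_FUNCTIONS || interval <= 0)
		return -1;
	ticks[n_ticks].fn = fn;
	ticks[n_ticks].extra = extra;
	ticks[n_ticks].interval = interval;
	n_ticks++;
	return 0;
}

/* Call for each emulated cycle; runs every callback whose interval divides it. */
static void run_ticks(int64_t cycle)
{
	for (int i = 0; i < n_ticks; i++)
		if (cycle % ticks[i].interval == 0)
			ticks[i].fn(ticks[i].extra);
}

/* Example callback: counts how often it has been called. */
static void count_tick(void *extra)
{
	(*(int *)extra)++;
}
```

A device would call register_tick() once at initialization, and the main loop would call run_ticks() as the cycle counter advances.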
20031027	Playing around with the sii SCSI controller.
20031028	Minor fixes.
		Adding an SGI emulation mode (-G), and some ARCBIOS
		stuff, which SGIs seem to use.
		Adding getbitmap() to the DEC prom emulation layer,
		so some more -D x models become more usable.
		Adding a dummy 'ssc' serial console device for
		DECsystem 5400 emulation.
		Playing around with TURBOchannel stuff.
20031030	Minor fixes.
		Adding the sub instruction. (Not tested yet?)
		Sign-extending the results of multu, addi,addiu,
		add,addu,sub,subu,mfcZ.
		Adding a colorplanemask device for DECstation 3100.
		Fixed the NetBSD/pmax bug: I had forgotten to reset
		asid_match to 0 between tlb entry checks. :-)  Now
		userland runs nicely...
20031031	Fixing more bugs:  unaligned load/store could fail
		because of an exception, but registers could be "half
		updated". This has been fixed now.  (As a result,
		NetBSD/pmax can now run with any of r2000,r3000,r4000,
		r4400, or r5000.)
		Adding some R5K and R10000 stuff.  (Note: R5K is NOT
		R5000. Weird.)
		Adding dummy serial console (scc) for MAXINE.
		MAXINE also works with framebuffer, but there is no
		color palette yet (only black and white output).
20031101	Moving code chunks around to increase performance by
		a few percent.
		The opcode statistics option (-s) now shows opcode
		names, and not just numbers. :-)
		Fixing the bug which caused NetBSD/pmax to refuse
		input in serial console mode, but not in keyboard/
		framebuffer mode: the osconsole environment variable
		wasn't set correctly.
		Adding DEC PROM getchar() call.
		The transmitter scanner of the dc device now scans
		all four channels at once, for each tick, so serial
		output is (approximately) 4 times faster.
20031103	Adding a dummy BT459 vdac device, which does nothing
		but allows a PMAG-BA turbochannel graphics card to be
		used as framebuffer.
		Several DECstation machines (-D 2, 3, and 4) can now
		use TURBOchannel option card framebuffers as console,
		for output. (Keyboard input is still not implemented
		for those models.)  Only PMAG-AA (1280x1024x8) and
		PMAG-BA (1024x864x8), both using BT459 vdac, have
		been tested so far.
		Modifying the X11 routines so that several framebuffer
		windows now can be used simultaneously (if several
		graphics option cards are to be emulated concurrently).
20031104	DEC MIPSMATE 5100 (KN230) interrupts are shared
		between devices. I've added an ugly hack to allow
		that to work, which makes it possible to boot NetBSD
		into userland with serial console.
20031106	Removing the -S (symbol) option, as symbol files can
		now be given in any order together with other file
		names to be loaded.
		cookin tipped me about using (int64_t) (int32_t)
		casts instead of manually sign-extending values.
		Casting sometimes increases performance, sometimes
		decreases. It's tricky.
		Importing mips64emul into CVS.
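The cast trick from the 20031106 entry, next to a manual sign-extension, might be sketched as follows. The function names are illustrative only. The narrowing cast to int32_t relies on the usual two's-complement wrap-around, which is implementation-defined in older C standards but is what common compilers do.

```c
#include <stdint.h>

/* Sign-extend the low 32 bits of a 64-bit value by casting. */
static uint64_t sign_extend32_cast(uint64_t value)
{
	return (uint64_t)(int64_t)(int32_t)value;
}

/* The manual alternative: test bit 31 and fill the upper half by hand. */
static uint64_t sign_extend32_manual(uint64_t value)
{
	value &= 0xffffffffULL;
	if (value & 0x80000000ULL)
		value |= 0xffffffff00000000ULL;
	return value;
}
```

Both functions return the same result for any input. Which one is faster depends on the compiler and host, as the entry notes.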
20031107	Adding a generic ARC emulation mode.
		Increasing performance of the framebuffer by not
		updating it (or the XImage) if a write to the
		framebuffer contains exactly what is already in it.
		(This improves scrolling speed and initialization.)
		Adding initial MIPS16 support.
		Adding initial disk image support (-d command line
		option), but this will not be used until I get some
		kind of SCSI-controller emulation working.
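The framebuffer optimization in the 20031107 entry (skipping writes that would not change anything) can be sketched with memcmp. The function fb_write and its parameters are hypothetical, not the real device code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Write len bytes into the framebuffer at offset.
 * Returns 1 if the contents changed (a redraw is needed),
 * 0 if the write was skipped or out of range.
 */
static int fb_write(unsigned char *fb, size_t fb_size, size_t offset,
	const unsigned char *data, size_t len)
{
	if (offset > fb_size || len > fb_size - offset)
		return 0;	/* outside the framebuffer: ignore */
	if (memcmp(fb + offset, data, len) == 0)
		return 0;	/* identical data: nothing to update */
	memcpy(fb + offset, data, len);
	return 1;
}
```

Only a return value of 1 would need to trigger an XImage update, which is where the scrolling and initialization speedup would come from.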
20031108	Adding the first MIPS16 instructions: "move y,X",
		"ld y,D(x)", and "daddiu S,K" (but the last one
		doesn't work yet).
		Fixing the console environment variable for
		DECstation machine type 2; both serial console and
		graphical console work now.
		Enough of the 'asc' controller is now implemented
		to let NetBSD get past scsi disk detection when
		no disk images are used.
		Other X-windows bit-depths than 24 bits work now,
		but colors are still not correct in non-24 bit modes.
		Keypresses in X framebuffer windows are now
		translated into console keypresses. (Normal keys, but
		not cursor keys or other special keys.)
20031111	Adding support for X11 using non-24-bit output.
20031120	Adding X11 mouse event to emulated mouse event
		translation, but it's not tested yet.
		Trying to get more of the SCSI controller emulation
		to work.
20031124	Raw binaries can now be loaded into memory.
20031204	Adding srec binary support.
20031220	Adding some super-ugly arcbios emulation code.
		Making some progress on the SGI and ARC machine
		emulations.
20031222	SGI and ARC progress. Multiple CPUs are now added to
		the arcbios component tree (although NetBSD cannot
		actually use more than one).
20031228	Adding 'crime' and 'macepci' fake devices for SGI
		emulation.
		Finally implementing the cop0 'compare' register.
		Improvements to the ns16550 device, but it is still
		incomplete.
		SGI userland is now reached, but interaction is broken
		(due to the buggy ns16550).
20031229	Adding some more instructions: teq, dsllv
		Adding a Nintendo 64 emulation mode (skeleton).
		Adding R4300 and R12000 to the cpu list.
20031230	Adding bltzal, bltzall, bgezal, bgezall (not really
		tested yet).
		Fixing the 16550 serial controller device (by not
		supporting fifo, so in fact it emulates a 16450
		instead).  This causes NetBSD/sgimips to run nicely
		into userland, sysinst, and so on.
		Some ARC/RD94 interrupts seem to work ok now, but
		i/o interrupts are still not correctly implemented.
		NetBSD/arc userland is reached and can be interacted
		with, but there's no sysinst (?).
20040103	Trying to get some Irix stuff to work, but it's hard.
		Fixing some Cobalt/linux problems.
20040104	Adding a dummy 8250 device, so that Linux/sgimips can output
		console messages.
		Adding dmultu. (The same as dmult, so I'm not sure it's correct.
		Perhaps dmultu is correct and dmult is wrong...)
		Fixing a bug in unaligned load/stores of 64-bit values (a cast
		was needed).
		Linux/sgimips in 64-bit works a bit more than before.
		Adding simple (polled) input functionality to dev_zs.
		Making some progress on SGI-IP22 (IP32 still works best,
		though).
		Fixing the mc146818 clock device in ARC/NEC and SGI emulation
		modes, the year field was not correct.
		Adding a fake 'pref' instruction (lwc3).
20040106	Separating out memory.h from misc.h.
		Refactoring of a lot of small code fragments.
		The PCI bus device is now shared between Cobalt, SGI, and ARC.
		Support for RAM mirroring (dev_ram.c, not really tested yet).
		Ugly hack to select the largest of ELF string symbol tables,
		if there are more than one.
		Memory hole fix for ARCBIOS, and a fix for very large (>= 4GB)
		amounts of emulated RAM.
		TGA (DEC 21030) PCI graphics device. NetBSD/arc can boot with
		this card and use it as a framebuffer console.
20040107	Adding a fix (partly incorrect) to daddi, to allow Linux/sgimips
		to boot in 64-bit mode.
20040108	Fixing a sll/nop bug (rd==0 for nop, not sa==0 as before).
20040109	Trying to get an SGI-IP32 PROM image to boot.
20040110	Faking R10000 cache things.
		The PROM image boots, although it takes almost forever for it
		to realize that there is no keyboard.
		The 'gbe' SGI-IP32 graphics device works enough to display the
		Linux framebuffer penguin in the upper left corner :-)
20040111	-p and -P addresses can now be given as symbol names, not just
		numeric values.
		Experimenting with adding a PCIIDE (dev_wdc) controller to the
		Cobalt emulation.
20040120	Adding src/bintrans.c. No code yet, but this is a place for
		ideas to be written down.
		Increasing performance a little bit by inlining the check for
		interrupts (which occurs for every instruction).
20040124	Experimenting with pure userland (syscall) emulation.
20040127	Fixes for compiling under Solaris.
20040206	Some bintrans experiments.
20040209	Adding some simple Ultrix userland emulation syscalls.
20040211	Adding decprom_dump_txt_to_bin.c to the experiments/ dir.
		Adding a section to doc/ on how to use DECstation PROM dumps.
		Adding a hello world example to doc/ as well.
20040218	TURBOchannel slots that are empty now return a DBE exception,
		so that Ultrix and DECstation PROMs don't complain about
		broken TURBOchannel ROMs.
		Working some more on the machine-dependent interrupt stuff.
20040219	Trying out some Linux/DECstation kernels (semi-successfully).
20040222	YES! I finally found the bug that caused Linux/SGI-IP32 to only
		work on Alpha, not on 32-bit machines.  It was a shift left,
		probably done using 6 bits on alpha, 5 bits on 32-bit machines.
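The kind of bug described in the 20040222 entry comes from leaving the shift count to the host. In C, shifting by a count greater than or equal to the operand width is undefined, and hosts differ in how many count bits they use. A possible host-independent approach, with hypothetical function names, is to mask the count explicitly:

```c
#include <stdint.h>

/* 64-bit left shift (as for dsllv): only the low 6 bits of the count are used. */
static uint64_t shift_left64(uint64_t value, unsigned count)
{
	return value << (count & 63);
}

/* 32-bit left shift (as for sllv): low 5 bits of the count, result sign-extended. */
static uint64_t shift_left32(uint64_t value, unsigned count)
{
	uint32_t result = (uint32_t)value << (count & 31);

	return (uint64_t)(int64_t)(int32_t)result;
}
```

With the mask in place, the result no longer depends on whether the host is an Alpha or a 32-bit machine.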
20040223	Some minimal DEC KN5800 progress; Ultrix prints some boot
		messages, detects 16 XMI R3000 cpus, and gets a NULL panic.
		It's all fake, though, the CPUs don't actually work.
		Still, better than nothing :-)
20040225	An Ultrix OSF1 kernel with a ramdisk now boots :-)  (It was
		a problem with ultrixboot not giving the same arguments as
		NetBSD's boot program.)
20040225(later)	Fixing a bug in the DECstation dc serial device; digits 0-9
		were translated to numeric keypad 0-9, not the normal 0-9.
		(This caused Ultrix to print escape sequences instead of
		digits.)
20040226	Some progress on machine-dependent interrupt delivery
		for -D7 (Maxine) and -D4, and some more 'scc' serial
		controller features are implemented (but no interrupts/
		dma/keyboard/mouse stuff yet).
20040228	Progress on the scc controller; -D4 works in both serial
		console mode and with keyboard (graphical console), but no
		mouse yet.
20040301	SGI mace interrupts are now done using the new machine-
		independent interrupt system.
20040303	Fixing an R5900 bug; the lowest 6 bits have special meaning
		for coprocessor functions, not just 5 bits as on non-R5900
		CPUs. (This fixes a bug which caused NetBSD to crash.)
20040304	Adding enough (fake) DMA capabilities to the ioasic device
		to allow Ultrix to print boot messages in the -D3, -D4,
		and -D7 modes, and also print graphical console messages
		in -D4 and -D7 modes.
		-D11 (DEC5500) polled getchar added (to the 'ssc' device).
		Adding the 'madd' instruction (including R5900 weird stuff).
20040304(later)	Playstation 2's GIF can now copy 640x16 pixel chunks, allowing
		NetBSD to scroll up the framebuffer.  The cursor also works
		better now.
		Playstation 2 bootinfo RTC data should now be passed correctly
		to the running kernel.
		DECstation rtc year should be either 72 or 73, anything else
		will cause Ultrix to give a warning about invalid year.
20040306	Combining playstation2's dmac, interrupt, and timer devices
		into one (ps2_stuff).
		Adding some R5900 instructions: mfsa, mtsa, pmfhi, pmflo, por,
		lq, and sq.  (Most of them are just guesses, though.)
		Implementing my own XImage putpixel routine, which can be
		inlined... significantly faster than normal XPutPixel. :-)
20040307	Implementing the basic functionality of a "PMAG-CA" pixelstamp
		accelerated framebuffer device. Works with NetBSD and
		Ultrix, but no cursor or color support.
20040308	PMAG-CA, -DA, and -FA pixelstamps seem to work now.
		Adding a hack to allow a pmax/mach kernel to be loaded (it's
		a COFF file with 0 (!) sections).
		Initial test of bt459 + framebuffer cursor support.
20040309	Fixes/updates of dev_dec5800 and dev_ssc (and dev_decxmi) allow
		a KN5800 Ultrix-OSF1-ramdisk kernel to boot all the way into
		userland and be interacted with.
		The bt459 cursor should now look semi-nice, but it is still
		a bit fake.
20040310	Moving the DEC CCA stuff from src/machine.c into a separate
		device file (devices/dev_deccca.c).
		An ugly hack added to allow some more OSF/1 kernels (almost
		a.out, but without many of the header fields) to load.
20040314	Adding PMAG-JA and PMAG-RO (1280x1024 x 8-bit) TURBOchannel
		graphics devices. They work in Ultrix, but only monochrome
		and no cursor, because there are no ramdacs or such yet.
20040315	Pixelstamp solid fill now supports colors other than just
		zero-fill.
		Adding a (new) regression test skeleton.
20040321	Some really minor updates.
20040323	Fixes to allow SGI-IP20 and IP22 to work a bit better
		(aliased memory), and adding "private" firmware-like vectors
		to arcbios emul. An IP22 Irix kernel gets far enough to
		print an assertion warning (and then double panics). :-)
20040324	Adding a generalization hack to the SCC serial controller
		to work with SGI-IP19 (in addition to DECstations).
		Adding the 'sdc1' instruction.
		Some progress on various SGI emulation modes.
20040325	Minor updates.
20040326	Fixed a 'madd' bug (r5900). NetBSD/playstation2 now reaches
		userland correctly.  And a simple fix which allows NetBSD
		timer interrupts to be triggered; NetBSD uses T_MODE_CMPE
		(compare), while Linux uses _OVFE (overflow).
20040328	Linux on Playstation 2 boots a bit. The Playstation 2
		graphics controller has been extended to work better with
		NetBSD, and to include some Linux support as well.
		Some interrupt handling enhancements on Playstation 2,
		needed for Linux' dma.
		128-bit loads and stores (lq and sq) are allowed, although
		the top half of quadwords are not modified by other
		instructions. (Linux uses lq and sq.)
		Big-endian X Windows servers now display correct rgb color,
		not bgr as before.
20040330	Some minor updates to the documentation.
20040401	Adding a dummy ps2 OHCI device.
20040402	Progress on the asc SCSI controller.
20040406	Hack to allow ./configure, make to work on HP-UX B.11.00
		on HPPA-RISC, gcc 3.3.2. (Does not work with HP's cc.)
		More progress on the asc SCSI controller. Fixing INQUIRY,
		adding READ_CAPACITY, adding READ. Works a bit with NetBSD
		and some (but not all) Ultrix kernels, on DECstation type 2.
		Adding WRITE, SYNCHRONIZE_CACHE.
		Mounting disks works in NetBSD :-)  It is a bit buggy,
		though. Or something else is buggy.
20040407	The bug is triggered by gunzip during NetBSD/pmax install.
20040408	Fixing a bug (non-nul-terminated string) which caused X11
		cursors to not display on Solaris.
		Unnecessary X11 redraws are skipped (removes some weird
		delays that existed before), and cursors are redrawn on
		window exposure. (The cursor functionality has been moved
		from dev_fb.c to x11.c.)
20040411	Fixing the DC7085 device so that Ultrix doesn't behave weird
		if both tx and rx interrupts occur at the same time.
		More advancements on the asc SCSI controller.
		More disk image filename prefixes are now recognized; c (for
		CD-ROM, as before), d for disk, b for boot device, r for
		read-only, and 0-7 for scsi id.
		Mounting disks works in Ultrix. Installing to disk usually
		crashes for various reasons, but an OSF/1 install gets
		relatively far (similar to the NetBSD/pmax install).
20040412	Trying to find the bug.
20040415	Finally found and fixed the bug; SCSI reads and writes
		(actually, any data in or data out) can be split up into
		multiple DMA transfers. That stuff was only partially
		implemented, and the part that was implemented was buggy.
		It works now. NetBSD/pmax and Ultrix 4.3 seem to like
		the SCSI stuff enough to install almost all the way.
20040415 (more)	Adding a hack which allows a host's cdrom device to be used as
		a cdrom device inside the emulator, eg /dev/cd0c.
		Making the cycle counter int64_t instead of long, as a 'long'
		overflows too easily on 32-bit machines. (The bug is still
		there, though.)
		I've now verified that a full NetBSD/pmax install can be done.
		If using a PMAG-AA graphics board, startx brings up X :-)
		mips64emul can be compiled inside NetBSD inside mips64emul,
		and it can run NetBSD in that environment. (I'm getting
		dizzy... :-)
20040417	Moving some coprocessor stuff from cpu.c to coproc.c.
20040424	Adding a BT455 vdac for PMAG-AA. Black and white are now
		rendered correctly in Xpmax.
		Adding colormap support to the BT459 device, for PMAG-BA.
20040425	Fixing a buffer length bug, which caused an Ultrix 4.5
		install to bug out on an i386 host.
20040429	FPU experiments.
20040502	More FPU experiments.
		Speedup for exception debug messages:  in quiet mode, debug
		messages were still evaluated, which took a relatively
		large amount of time.
20040503	Most FPU stuff fixed, but there is at least one known bug
		left; ps axu in NetBSD triggers it (ps loops forever).
20040504	A default install of Ultrix 4.5 succeeded! It boots up with
		a graphical login.
		Fixing the keyboard repetition bug (a lk201 "up" (release)
		scancode is now sent after every key).
20040505	Both CR and LF now produce the same lk201 scancode, so that
		pressing 'enter' works as expected in Ultrix.
20040506	Adding a vaddr to paddr translation cache, causing a speedup
		of perhaps 50% or more.
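A vaddr to paddr translation cache like the one in the 20040506 entry could be organized as a small direct-mapped table. The sketch below is an assumption about the general technique, not the emulator's implementation: all names, the 4 KB page size and the 64-entry size are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT	12	/* 4 KB pages (assumed) */
#define CACHE_ENTRIES	64	/* hypothetical cache size */

struct xlate_entry {
	uint64_t vpage;		/* virtual page number */
	uint64_t ppage;		/* physical page number */
	int valid;
};

static struct xlate_entry xlate_cache[CACHE_ENTRIES];

/* Remember a translation produced by the slow (full TLB) path. */
static void xlate_insert(uint64_t vaddr, uint64_t paddr)
{
	uint64_t vpage = vaddr >> PAGE_SHIFT;
	struct xlate_entry *e = &xlate_cache[vpage % CACHE_ENTRIES];

	e->vpage = vpage;
	e->ppage = paddr >> PAGE_SHIFT;
	e->valid = 1;
}

/* Returns 1 and fills *paddr on a hit, 0 on a miss. */
static int xlate_lookup(uint64_t vaddr, uint64_t *paddr)
{
	uint64_t vpage = vaddr >> PAGE_SHIFT;
	struct xlate_entry *e = &xlate_cache[vpage % CACHE_ENTRIES];

	if (!e->valid || e->vpage != vpage)
		return 0;
	*paddr = (e->ppage << PAGE_SHIFT) | (vaddr & ((1ULL << PAGE_SHIFT) - 1));
	return 1;
}

/* Must be called whenever the emulated TLB or address space changes. */
static void xlate_flush(void)
{
	memset(xlate_cache, 0, sizeof(xlate_cache));
}
```

On a miss the caller would fall back to the full translation and then call xlate_insert(). Forgetting the flush would leave stale mappings behind after a TLB write.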
20040507	Fixing PMAG-BA color for Ultrix. (Ultrix relies on interrupts
		coming from the TURBOchannel slot to update the palette.)
20040508	Fixing cursor positioning for PMAG-BA.
20040511	Prints current nr of instructions per second, not only
		average, when using -N.
20040515	Some more bintrans experiments.
20040606	Adding ARCBIOS GetReadStatus() and Read().
		Adding some instructions: tlt, tltu, tge, tgeu, tne.
20040607	Adding the dsub instruction.
		Some minimal progress on SGI-IP30 emulation.
		Applying a patch from Juli Mallett to src/file.c (I'm not
		sure yet if it breaks or fixes anything).
		Some minor fixes for SGI-IP22 (such as faked board revision
		numbers).
20040608	ll/sc should now fail if any unrelated load/store occurs.
		Minor changes to the configure script.
		Adding some ifdefs around code which is not often used
		(the mfhi/mflo delay, and the last_used TLB experimental
		code); this might cause a tiny speedup.
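The ll/sc rule from the 20040608 entry (sc fails if any unrelated load/store happened after the ll) can be modelled with a single link flag. The struct and function names below are hypothetical, and the policy shown is the conservative one the entry describes.

```c
#include <stdint.h>

struct llsc_state {
	int link_valid;		/* set by ll, cleared by any other access */
	uint64_t link_addr;	/* address used by the last ll */
};

/* ll: remember the address and arm the link. */
static void do_ll(struct llsc_state *s, uint64_t addr)
{
	s->link_valid = 1;
	s->link_addr = addr;
}

/* Any load or store other than ll/sc breaks the link. */
static void do_other_access(struct llsc_state *s)
{
	s->link_valid = 0;
}

/* sc: returns 1 if the store may be performed, 0 if it must fail. */
static int do_sc(struct llsc_state *s, uint64_t addr)
{
	int ok = s->link_valid && s->link_addr == addr;

	s->link_valid = 0;	/* the link is consumed either way */
	return ok;
}
```

An ll followed directly by sc to the same address succeeds, while an intervening load or store makes the sc fail, so the guest's retry loop runs again.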
20040609	Minor fixes.
20040610	Various minor SGI fixes (64-bit ARCS stuff, progress on the
		CRIME/MACE interrupt system, and some other random things).
20040611	More crime/mace progress, and some more work on pckbc.
		KN5800 progress: adding a XMI->BI adapter device; a disk
		controller is detected (but it is just a dummy so far).
20040612	Adding "dev_unreadable", which simplifies making memory
		areas unreadable. (NetBSD on SGI-IP22 no longer detects
		non-existent hpc1 and hpc2 busses.)
		Implementing rudimentary support for IP22 "local0" and
		"local1" interrupts, and "mappable" local interrupts.
		Some progress on the WDSC SCSI controller on IP22, enough
		to let NetBSD get past the disk detection and enter
		userland!  :-)
		The zs (zilog serial) device now works well enough to let
		NetBSD/sgimips be interacted with on IP22. :-)  (Though
		it is very ugly and hardcoded.)
20040613	IP32 didn't work last night, because there were too many
		tick functions registered. That has been increased now.
		Trying out NetBSD/sgimips 2.0 beta kernels. There are some
		differences compared to 1.6.2, which I'm trying to solve.
		Interrupt fixes for IP32: _serial and _misc are different.
		Separation of IP22 (Full-house) and IP24 (Guinness).
20040614	Modifying the memory layout for IP20,22,24,26 (RAM is now
		offset by 128MB, leaving room for EISA registers and such),
		and moving around some code chunks. This is not well
		tested yet, but seems to work.
		Moving parts of the tiny translation cache, as suggested
		by Juli Mallett.  It seems that the speedup isn't as
		apparent as it was a few weeks ago, though. :-(
		Speedups due to not translating addresses into symbol
		names unless the symbol name is actually printed.
		Added support for loading old big-endian (Irix) ECOFF
		kernels (0x60 0x01 as the first two bytes).
20040615 (late)	Adding enough SGI IP20 (Indigo) support to let NetBSD 2.0
		enter userland :-)  No interrupt specifics are implemented
		yet, so it hangs while doing terminal output.
20040618	Experimenting with the WDSC SCSI controller for IP20,22,24.
20040620	Adding a program which converts SGI prom dumps from text
		capture to binary, and some hacks to try to make such an
		IP22 PROM work better in the emulator.
20040621	Removing the Nintendo 64 emulation mode, as it is too
		uninteresting to support.
		Adding SCSI tape device support (read-only, so far).
		Fixing a bug which caused the cursor to be corrupted if new
		data was written to the framebuffer, but the cursor wasn't
		moved.
20040622(early)	Finally! Making progress on the SCSI tape stuff; when going
		past the end of a file, automagically switch to the beginning
		of the next.
20040622(late)	Trying to track down the last SCSI tape bugs.
		Removing _all_ dynamic binary translation code (bintrans),
		starting from scratch again.
20040623(early)	Performing a general code cleanup (comments, fixing stuff
		that led to compiler warnings, ...).
		Disabling MIPS16 support by default, and making it a
		configure time option to enable it (--mips16). This gives
		a few percent speed increase overall.
		Increasing performance by assuming that instruction loads
		(reading from memory) will be at the same page as the last
		load.  (Several percent speedup.)
		Moving the list of kernels that can be found on the net from
		README to doc/.
20040624	Finally! I found and fixed the bug which caused 'ps', 'top',
		'xclock', and other programs in NetBSD/pmax to behave weird.
		Increasing performance by a few percent by running as many
		instructions in a row as possible, before checking for
		hardware ticks.
		When booting from SCSI tapes on DECstation, the bootstring
		now contains 'tz' instead of 'rz'.
		Adding a second ARC machine mode, "Acer PICA-61", -A2.
		Disabling the support for "instruction delays" by default
		(it has to be enabled manually in misc.h now, but is never
		used anywhere anyway).
		Other minor optimizations (moving around stuff in the
		cpu struct in misc.h, and caching cpu->pc in cpu.c).
		Separating the tiny translation cache into two, one for
		code and one for data. This gives a few percent speed
		increase.
20040625(early)	I think now is a good time for a "feature freeze",
		to let the code stabilize and then make some kind of
		first release.
20040625(later)	Adding a -v (verbose) command line option. If -v is not
		specified, the emulator goes into -q (quiet) mode just before
		it starts to execute MIPS code.
20040627	The configure script now adds -fomit-frame-pointer to the
		compile flags if the $CC seems to be able to handle that.
		Found and fixed a serious interrupt bug in BT459 (Ultrix'
		behaviour required a hack, which was incorrect), so
		performance for machines using the PMAG-BA framebuffer is
		now improved.
		For X11 bitdepths other than 8 or 24, a warning message
		is printed at startup.
		A number of other minor fixes, optimizations, updated
		comments and so on.
		Adding a BUGS file, a list of known bugs.
		Adding a minimal man page, doc/mips64emul.1.
20040628	Hacks for faking the existence of a second level cache
		(ARCBIOS and other places).
		An important fix for dc7085: tx interrupts should happen
		before rx interrupts, not the other way around as it was
		before. (This speeds up NetBSD boot on DECstation, and
		fixes a bug which Ultrix triggered on heavy keyboard input.)
		A couple of other minor fixes.
		Framebuffer fix: there was a bug which caused the rightmost/
		bottom pixel to sometimes not be updated, when running in
		scaledown mode. This is now fixed.
		Adding a small program which removes "zero holes" from
		harddisk image files.
20040629	More minor fixes.
20040629(later)	Adding -A3 (NEC RISCstation 2200) (this is similar to
		the 2250 model that NetBSD/arc can already boot all the
		way into userland and be interacted with), and -A4
		(Deskstation Tyne).
		Some more minor fixes.
20040630	Adding support for 15 and 16 bits X11 framebuffers,
		and converting from XYPixmap to ZPixmap (this fixes the
		problem of updates appearing in "layers" on some X
		servers).
		The pixels in the mouse cursor (for BT459) are now colored
		as the emulated OS sets them, although no transparency
		masking is done on the edges of the cursor yet. (In plain
		English:  the mouse cursor is no longer just a white solid
		square, you can actually see the mouse cursor image
		on the white square.)

==============  RELEASE 0.1  ==============

20040701	The -j option now takes a name: the name of the kernel as passed
		on to the bootloader.  ("netbsd" is the default name.)
		Adding support to load bootstrap code directly from a disk
		image, for DECstation.  Both NetBSD/pmax and Ultrix boot
		straight off a disk image now, with no need to supply a
		kernel filename on the command line.  (Ultrix still needs
		-j vmunix, though, to boot from /vmunix instead of /netbsd.)
20040702	Minor bugfix (some new untested code for X11 keypresses was
		incorrect).
20040702(later)	Adding an ugly hack for CDROMs in FreeBSD; if an fread() isn't
		done at a 2048-byte aligned offset, it will fail. The hack
		tries to read at 2048-byte aligned offsets and move around
		buffers to make it work.
		Adding video off (screen blanking) support to BT459.
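The aligned-read hack described above can be sketched roughly as follows. This is only an illustration under stated assumptions: aligned_read() and SECTOR_SIZE are hypothetical names, not the emulator's own, and a plain long offset is used for brevity.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 2048	/* alignment named in the entry above */

/*
 * Hypothetical helper: read len bytes starting at an arbitrary offset,
 * while only issuing reads that start on a 2048-byte boundary and span
 * whole sectors. Returns the number of bytes copied into buf.
 */
size_t aligned_read(FILE *f, long offset, void *buf, size_t len)
{
	long start = offset - (offset % SECTOR_SIZE);	/* round down */
	size_t skip = (size_t)(offset - start);
	size_t span = skip + len;
	/* round the request up to a whole number of sectors */
	size_t total = (span + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
	unsigned char *tmp = malloc(total);
	size_t got, n = 0;

	if (tmp == NULL)
		return 0;
	if (fseek(f, start, SEEK_SET) == 0) {
		got = fread(tmp, 1, total, f);
		if (got > skip) {
			/* move the wanted part of the bounce buffer to the caller */
			n = got - skip;
			if (n > len)
				n = len;
			memcpy(buf, tmp + skip, n);
		}
	}
	free(tmp);
	return n;
}
```

The bounce buffer costs one extra copy per read, which seems an acceptable price for a workaround that is only needed on some hosts.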

==============  RELEASE 0.1.1  ==============

20040702(later)	Cleanup to remove compiler warnings (Compaq's cc, Solaris' cc,
		and gcc 3.3.3/3.3.4 in Linux), mostly by putting ULL on large
		numeric constants.
		Better support for scaledown of BT459 cursors, but still not
		color-averaging.
		Beginning the work on adding better memory latency support
		(instruction delays), enabled by the --delays configure option.
20040703	Modifications to the configure script so that a config.h file
		is created, containing things that were passed along as
		-Dxxx on each cc command line before.
		More work on instruction latency support; trying to separate
		the concepts of nr of cycles and nr of instructions.
20040704	Working on R2000/R3000 caches.
		Adding a '--caches' option to the configure script.
		Various small optimizations.
		R3000 caches finally work. (I know that there is at least one
		bug, regarding interrupt response.)
20040705	Working on the 'le' device, and on a generic (device
		independent) networking framework. le can transmit and receive
		packets, and the network framework fakes ARP responses from a
		fake gateway machine (at a fixed ip address, 10.0.0.254).
		Adding a '-c' command line option, which makes emulated_hz
		automatically adjust itself to the current number of emulated
		cycles per host CPU second (measured at regular intervals).
20040707	Removing the '-c' option again, and making it the default
		behaviour of the emulator to automatically adjust clock
		interrupts to runtime speed (as long as it is above 1 MHz).
		(This can be overridden by specifying a static clock rate with
		the -I option.)
		Updating the doc/ stuff a bit.
		Generalization of the DECstation bootblock loading, to work
		with Sprite/pmax. Lots of other minor modifications to make
		Sprite work, such as adding support for DECstation "jump table"
		PROM functions, in addition to the old callback functions.
		Sprite boots from a disk image, starting the kernel if the
		argument "-j vmsprite" is used, but it seems to not like the
		DBE exceptions caused by reading empty TURBOchannel slots. :-/
20040708	Minor changes and perhaps some tiny speed improvements.
		The Lance chip is (apparently) supposed to set the length of
		received packets to len+4. (I've not found this in any 
		documentation, but this is what NetBSD expects.) So now, ICMP
		echo replies work :-)  UDP works in the outgoing direction,
		in the incoming direction, tcpdump can see the packets but they
		seem to be ignored anyway. (Weird.)
		Adding a separate virtual-address-to-host-page translation
		cache, 1-entry for loads, 1-entry for stores. (For now, it
		only works on R4000 as there are conflicts with cache usage
		on R3000).
		Changing the lower clock speed bound from 1 MHz to 1.5 MHz.
20040709	Incoming UDP checksums were wrong, but are now set to zero
		and NetBSD inside the emulator now accepts the packets (e.g.
		nameserver responses).  Host lookups and even tftp file
		transfers (using UDP) work now :-)
		Adding a section on Networking to the Technical documentation,
		and a preliminary NetBSD/pmax install instruction for network
		installs to the User documentation.
		Some updates to the man page.
20040709(later)	Fix to the TURBOchannel code to allow Sprite to get past the
		card detection. Seems to still work with Ultrix and NetBSD.
		This also makes Linux/DECstation properly recognize both the
		Lance controller and the SCSI controller. Linux 2.4.26 from
		Debian boots nicely in framebuffer mode :-)
20040710	Some bits in the KN02 CSR that were supposed to be readonly
		weren't. That has been fixed, and this allows Linux/DECstation
		to get past SCSI detection. :-)
		Minor updates to the ASC controller, which makes Linux and
		OpenBSD/pmax like the controller enough to be able to access
		SCSI devices. OpenBSD/pmax boots from a disk image for the
		first time. :-)  Linux detects SCSI disks, but I have no
		bootable Linux diskimage to test this with.
		Updating the doc/ to include instructions on how to install
		OpenBSD/pmax onto a disk image.
		Naively added a PMAGB-BA (1280x1024x8) in hopes that it would
		basically be a PMAG-BA (1024x864x8) in higher resolution,
		but it didn't work that way. I'll have to look into this later.
		Adding a -o option, useful for selecting '-s' (single user
		mode) during OpenBSD install and other things.
		After a lot of debugging, a serious bug related to the tiny
		cache was found; Linux just changes the ASID and returns when
		switching between processes on some occasions without actually
		_writing_ to the TLB, and I had forgotten to invalidate the
		tiny cache on such a change.
20040711(early)	I've been trying to repeat the OpenBSD install from yesterday,
		but apart from the first initial install (which was
		successful), I've only been able to do one more. Several
		attempts have failed with a filesystem panic in the middle
		of install. I'm not sure why.
20040711	I found the "bug": wget downloaded the simpleroot28.fs.gz file
		as read-only, and gunzip preserved those flags. Thus, OpenBSD's
		installer crashed as it didn't get its writes through to the
		disk.
		Parts of the 1280x1024x8 PMAGB-BA graphics card have been
		implemented; it works (unaccelerated) in NetBSD and OpenBSD,
		but Ultrix does not seem to like it.
		Cleaned up the BT459 cursor offset stuff a bit.
		Trying to make the emulated mouse coordinates follow the host's
		mouse coordinates (for lk201, DECstation), by
		"de-accelerating" the data sent to the emulated OS.
20040711(later)	Fix so that Sprite detects the PMAG-BA correctly.
		Adding some stuff about NFS via UDP to the documentation.
		Fixed the 'update flag' for seconds, so now Sprite doesn't
		crash because of timer-related issues anymore.
		Fixing KN02 interrupt masks a bit more, to make Sprite not
		crash. Sprite now runs quite well.
20040712	Working on IP/UDP fragmentation issues. Incoming UDP packets
		from the outside world can now be broken up into fragments
		for the guest OS. (This allows, for example, OpenBSD/pmax to
		be installed via nfs.)  Outgoing fragmented packets are NOT
		yet handled.
		Linux doesn't use 64-bit file offsets by default, which is
		needed when using large disk images (more than 2GB), so the
		configure script has now been modified to add necessary
		compiler flags for Linux.
20040713	Trying out some minor optimizations.
		Refreshing the UDP implementation in src/net.c a little.
20040714	Updating the documentation a little on how to experiment
		with a Debian Linux install kernel for DECstations.
		A 'mini.iso' Linux image for DECstation has different fields
		at offsets 0x10 and 0x14, so I'm guessing that the first is
		the load address and the second is the initial PC value.
		Hopefully this doesn't break anything.
		Some initial TCP hacks, but not much is working yet.
		Some updates for IP30:  The load/store 1-entry cache didn't
		work too well with IP30 memory, so it's only turned on for
		"MMU4K" now. (This needs to be fixed some better way.)
		Adding a hack which allows Linux/Octane to use ARC write()
		and getchild() on IP30. Linux uses ARCBIOS_SPB_SIGNATURE as a
		64-bit field, it was 32-bit before.
		Making ugly hacks to the arcbios emulation to semi-support
		64-bit equivalents of 32-bit structures.
20040716	Minor fixes to the configure script (and a few other places)
		to make the sources compile out-of-the-box on HP-UX (ia64
		and HPPA), old OpenBSD/pmax (inside the emulator itself), and
		Tru64 (OSF/1) on Alpha.
		A couple of other minor fixes.
20040717	A little TCP progress; OpenBSD/pmax likes my SYN+ACK replies,
		and tries to send out data, but NetBSD/pmax just drops the
		SYN+ACK packets.
		Trial-and-error led me to change the 64-bit ARCS component
		struct again (Linux/IP30 likes it now). I'm not sure about all 
		of the offsets yet, but some things seem to work.
		More 64-bit ARCS updates (memory descriptors etc).
		Better memory offset fix for IP30, similar to how I did it for
		IP22 etc. (Hopefully this doesn't break anything else.)
		Adding a MardiGras graphics controller skeleton for SGI-IP30
		(dev_sgi_mardigras.c).
		Thanks to Stanislaw Skowronek for dual-licensing mgras.h.
		Finally rewrote get_symbol_name() to O(log n) instead of O(n)
		(Stanislaw's Linux kernel had so many symbols that tracing
		with the old get_symbol_name() was unbearably slow).
		Removing all of the experimental tlbmod tag optimization code
		(the 1-entry load/store cache), as it causes more trouble than
		the performance gain was worth.
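An O(log n) symbol lookup of the kind mentioned above is typically a binary search over a table sorted by address. A minimal sketch, assuming non-overlapping symbols; struct symbol and lookup_symbol() are hypothetical and not the emulator's actual get_symbol_name() interface:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol record; the emulator's real structure may differ. */
struct symbol {
	uint64_t addr;		/* start address */
	uint64_t len;		/* size in bytes */
	const char *name;
};

/*
 * Binary search over an array sorted by ascending addr.
 * Returns the name of the symbol containing addr, or NULL if none does.
 * On success, *offset (if non-NULL) receives addr minus the symbol start.
 */
const char *lookup_symbol(const struct symbol *syms, size_t n,
	uint64_t addr, uint64_t *offset)
{
	size_t lo = 0, hi = n;	/* search the half-open range [lo, hi) */

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < syms[mid].addr)
			hi = mid;
		else if (addr - syms[mid].addr >= syms[mid].len)
			lo = mid + 1;
		else {
			if (offset != NULL)
				*offset = addr - syms[mid].addr;
			return syms[mid].name;
		}
	}
	return NULL;
}
```

The table has to be sorted once after all symbols are loaded (for example with qsort()), after which every lookup during tracing takes a logarithmic number of steps.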
20040718	The MardiGras device works well enough to let Linux draw the
		SGI logo and output text.
		A bunch of other minor changes.
20040719	Trying to move out all of the instruction_trace stuff from the
		main cpu loop (for two reasons: a little performance gain,
		and to make it easier to add a GUI later on).
20040720	Finally found and fixed the ethernet/tcp bug. The hardware
		address is comprised of 6 bytes, where the _first_ byte should
		have a zero as the lowest bit, not the last byte. (This causes
		NetBSD and Linux running in the emulator to accept my SYN+ACK
		packets.)
		Getting the first nameserver address from /etc/resolv.conf.
		(This is not used yet, but could be useful if/when I add
		internal DHCP support.)
		Working more on the TCP stuff; TCP seems to be almost working,
		the only immediate problem left is that the guest OS gets
		stuck in the closing and last-ack states, when it shouldn't.
		It is now possible to install NetBSD and OpenBSD via ftp. :-)
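The hardware address rule behind the ethernet/tcp fix above can be written as a one-line check. A small sketch; mac_is_unicast() is a hypothetical helper name, not something taken from src/net.c:

```c
#include <stdint.h>

/*
 * In an Ethernet address, the least significant bit of the FIRST byte
 * is the individual/group bit: 0 means unicast, 1 means multicast or
 * broadcast. A station's own address must therefore have it cleared.
 */
int mac_is_unicast(const uint8_t mac[6])
{
	return (mac[0] & 0x01) == 0;
}
```

An address starting with 0x10 passes this check, while one starting with 0x11 is a group address, which guest operating systems are free to ignore as a source address.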
20040721	Trying to fix the last-ack bug, by sending an RST after the
		end of a connection. (Probably not a correct fix, but seems
		to work?)
		Adding a my_fseek() function, which works like fseek() but
		with off_t instead of long, so that large disk images can
		be used on systems where long is 32 bits.
20040722	Trying to fix some more TCP related bugs.
20040725	Changing the inlined asm statement in bintrans_alpha.c into
		a call to a hardcoded array of bytes that do the same thing
		(an instruction cache invalidation). This allows the same
		invalidation code to be used regardless of compiler.
		Some other minor changes.
20040726	Minor updates. The configure script is now more verbose.
		A Debian/IP22 Linux tftp boot kernel requires ARCS memory to
		be FreeMemory, not FreeContiguous. (This should still work with
		other SGI and ARC OSes.)
		Fix for ARCS write(), so it returns good write count and
		success result (0).
		Some hacks to the IP22 memory controller, to fake 72MB RAM
		in bank 0.
		The IP22 Debian kernel reaches userland (ramdisk) when run
		with -G24 -M72 -CR4400, if a special hack is done to the
		zs device.
20040730	Removing mgras.h, as I'm not sure a file dual-licensed this way
		would work. (Dual-licensing as two separate files would work
		though.)
		Preparing for the upcoming release (0.2).
20040801	Fixing the 512 vs 2048 cdrom sector size bug; I hadn't 
		implemented the mode select SCSI command. (It still isn't
		really implemented.)
		A bug which crashes the emulator is triggered when run with
		new NetBSD 2.0_BETA snapshots on a Linux/i386 host. I'm not
		sure why.
		UDP packets sent to the gateway (at 10.0.0.254) are now
		forwarded to the machine that the host uses as its nameserver.
		Some other minor fixes.

==============  RELEASE 0.2  ==============

20040803	A post-3.5 OpenBSD/sgimips kernel snapshot with ramdisk seems
		to boot fine in the emulator, all the way to userland, and
		can be interacted with.
		Adding a -y option, used to set how many (random) instructions
		to run from each CPU at max (useful for SMP instruction
		interleave experiments).
		Importing a 8x16 console font from FreeBSD (vt220l.816).
		Adding a skeleton for a 80x25 text console device (dev_vga),
		useful for some ARC modes. (Character output is possible, but
		no cursor yet.)
		Adding a dev_zero device (returns zeroes on read).
		OpenBSD/arc 2.3 can get all the way to userland with -A4 (if
		the wdc devices are commented out) but bugs out there, probably
		because of interrupt issues.
		Adding a -A5 ARC emulation mode (Microsoft-Jazz, "MIPS Magnum")
		which NetBSD seems to like. No interrupt specifics yet, so
		it hangs while waiting for SCSI.
20040804	Some dev_mp updates.
		The -y switch has to do with number of cycles, not number
		of instructions; variable names have been changed to reflect
		this.
20040805	Minor updates. Adding some more CPU types/names, but they
		are probably bogus.
		Adding a MeshCube emulation mode. Just a skeleton so far, but
		enough to let a Linux kernel print some boot messages.
		Adding the 'deret' instruction.
20040806	Adding include/impactsr-bsd.h (a newer version of what was in
		mgras.h before, and this time with only a BSD-style license),
		but it is not used yet.
20040810	Some Au1500 updates.
20040811	Adding the 'clz', 'clo', 'dclz', and 'dclo' special2 (MIPS32
		and MIPS64) instructions.
		More Au1500 updates.
20040812	Using fseeko(), when it is available.
		Other minor updates.
		Adding a NetGear WG602 emulation mode skeleton (-g); after
		a lot of trial and error, a Linux kernel (WG602_V1715.img)
		gets all the way to userland, but hangs there.
20040818	Adding signal handlers to better cope with CTRL-Z and CTRL-C.
		Adding a simple interactive single-step debugger which is
		activated by CTRL-C. (Commands so far: continue, dump, help,
		itrace, quit, registers, step, trace, version)
20040818(later)	Adding a 'tlbdump' debugger command, and some other minor
		fixes.
20040819	Minor updates. Adding an 'unassemble' debugger command.
20040822	Minor updates to the regression testing framework.
20040824	Minor updates based on feedback from Alec Voropay
		(configure script updates for Cygwin and documentation).
20040826	Minor updates.
		Adding a cursor to the VGA text console device.
		Changing all old 11:22:..55:66 ethernet macs to 10:20..60,
		still hardcoded though.
20040828	Minor updates.
20040829	mips64emul is 1 year old today :-)
20040901	tests/README now lists "all" MIPS opcodes. This list should
		be updated whenever a new opcode is implemented, or when a
		regression test is added. (A combination of instructions from
		the TX79 manual, the GNU assembler, and the MIPS64 manual).
		Hopefully I haven't missed too many.
		Adding a section on regression testing to doc/technical.html.
20040902	Finally beginning the work on separating out the stuff from
		main.c into a "struct emul". Very time-consuming.
		Some minor fixes for LL/SC on R10000.
20040905	Moving more stuff from main.c into struct emul. Unfortunately,
		it seems that this causes a slowdown of the emulator.
		Userland emulation is now only used if --userland is used
		when running configure.
		Modifying src/symbol.c to not use global variables.
20040906	Minor update.
20040914	Using $COPTIM when detecting which compiler flags to use in
		the configure script. (This makes sure that combinations of
		flags should work.)
		There'll probably be a 0.2.1 release some time soon, but I'll
		do some more clean-up first.
		Minor update to the detection of ECOFF files, but I don't like
		it; sometimes the endianness of the magic value seems to be
		swapped, but it doesn't have to do with endianness of the
		actual data?
20040916	Minor updates. Adding an Example section to the manpage, but
		as I'm not really familiar with manpage formatting, it will
		need to be rewritten later.
20040917	Finally making the coprocessor instructions disassemblable
		even when not running.
		Doing some testing for the 0.2.1 release.

==============  RELEASE 0.2.1  ==============

20040923	Updating the documentation about how to (try to) install
		Debian GNU/Linux.
20040924	Some more updates to the documentation.
20040925	Adding overflow stuff to 'add' and 'sub'.
20040926	Minor updates: possibly a fix to 'sltiu' (the imm value
		should be treated as signed, and then converted to unsigned,
		according to the MIPS64 manual), and removing the
		'last_was_rfe' stuff (again).
		OpenBSD/arc used speed-hack jumps with other deltas than just
		+/- 1 (it used -3 iirc), so the jump speedhack should now
		support any delta. Also adding bgtzl and blezl as possible
		instructions for the speed-hack jumps. (This needs to be
		tested more.)
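The sltiu behaviour described above (immediate treated as signed, then compared as unsigned) can be illustrated like this. A sketch for 64-bit registers; the function name and signature are made up for the example and do not mirror the emulator's instruction decoder:

```c
#include <stdint.h>

/*
 * Sketch of MIPS sltiu semantics: the 16-bit immediate is first
 * sign-extended to the register width, and the comparison itself is
 * then done on unsigned values.
 */
uint64_t sltiu(uint64_t rs, uint16_t imm)
{
	uint64_t ext = (uint64_t)(int64_t)(int16_t)imm;	/* sign-extend */

	return rs < ext ? 1 : 0;
}
```

One consequence is that an immediate of 0xffff compares against the largest unsigned value, so the result is 1 for every rs except all-ones.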
20040928	Minor updates. Some ARC stuff ("arcdiag" runs now).
		cpu_register_dump() now also dumps coprocessor registers.
20040929	More ARC updates. Making the code look a tiny bit nicer
		than before. "arcdiag.ip22" works for -G22 (SGI-IP22).
		Apparently the overflow support in the 'add' instruction
		was incorrect, so I disabled it.
20041002	Trying to install Ultrix in the emulator, but the installer
		crashes; found (and fixed) the bug rather quickly: the "fix"
		I implemented a few days ago for the 'sub' instruction
		(according to the MIPS64 manual) caused the bug.
20041004	Changing the behaviour of the -j command line option. The
		default is now "" (or taken from the last filename given on
		the command line), not "netbsd". In practice, this doesn't
		change much, except that -j netbsd.pmax is no longer needed
		when installing NetBSD.
		Adding a COMPILE_DATE string to config.h.
20041007	Adding a NEC RISCserver 4200 model (-A6), and some more
		updates to the ARC component tree generator.
20041008	The 'll' instruction should be signed, not unsigned as before.
		This (and some other minor fixes) causes Irix on SGI-IP32 (O2)
		to actually boot far enough to print its first boot messages :)
		Working on some new dynamic bintrans code. Enough is now
		implemented so that the 'nop' instruction is translated
		and there is support for Alpha, i386 and UltraSparc backends,
		but performance is about 50% worse than when running without
		bintrans. (This is as expected, though.)
20041009	Minor updates to the documentation.
		Using mprotect() to make sure that the code created dynamically
		by the bintrans subsystem is allowed to be executed. (This
		affects newer OpenBSD systems, and possibly others.)
		The translated code chunks now only get one argument passed to
		them, the (struct cpu *) of the current cpu.
20041010	Hack to dev_le.c which makes Ultrix accept the initialization
		of the LANCE controller. (This goes against the LANCE
		documentation though.)
		In src/net.c, a fix for Ultrix (which seems to send larger
		ethernet packets than the actual TCP/IP contents). The hack to
		dev_le.c and this fix is enough to let Ultrix access the
		Internet.
		For DECstation, when booting without a disk image (or when
		"-O" is used on the command line), use "tftp" instead of "rzX"
		for the boot string.
20041011	Adding cache size variables to the emul struct, so that these
		can be set on a per-machine basis (or potentially manually
		on the command line).
20041012	Mach/PMAX now passes the LK201 keyboard self-test (although
		the keyboard ID is still bogus).
20041013	Minor updates.
		Hacks to the ASC SCSI controller for Mach/PMAX, hopefully this
		will not break support for other OSes.
20041014	Minor fix to src/emul.c for reading bootblocks at the end of
		a disk or cdrom image (thanks to Alexandru Lazar for making me
		aware of this).
		Adding "gets()" to src/dec_prom.c.
		Working a bit on ARC stuff. Importing pica.h from NetBSD.
		Minor updates to the ARC component tree for PICA-61.
		Adding a dev_jazz.c (mostly for PICA-61).
		Renaming dev_jazz.c into dev_pica.c. Working on PICA timer
		and interrupt specifics.
20041016	Adding some dummy entries to lk201.c to reduce debug output.
		Some bintrans updates (don't run in delay slots or nullified
		slots, read directly from host memory and not via memory_rw(),
		try mmap() before malloc() at startup, and many other minor
		updates).
		Adding bintrans_mips.c for 64-bit MIPS hosts, but it is not
		used yet.
20041017	Minor updates.
20041018	Update to dev_mc146818 to allow Mach to boot a bit further.
		The "hardware random" in dev_mp.c now returns up to 64 bits
		of random data.
20041019	Minor updates to the way cache sizes are used throughout the
		code. Should be mostly ok for R[234]x00.
		src/file.c now loads files using NO_EXCEPTIONS. Whether this
		is good or bad, I'm not sure.
20041020	Adding a Linksys WRT54G emulation skeleton (-H).
20041021	Minor updates.
		R1[024]000 cache size bits in the config register should now
		be ok.
		Trying to make dev_asc.c work better with PICA.
		More work on PICA interrupts (but they are broken now).
20041022	Generalizing the dev_vga text console device so that it can be
		used in other resolutions than just 80x25. Works with
		OpenBSD/arc.
		emul->boot_string_argument is now empty by default (except
		for DECstation modes, where it is "-a").
		Speedup of dev_ram by using mmap() instead of malloc().
		Using mmap() in memory.c as well, which reduces memory usage
		when emulating large memory sizes if the memory isn't actually
		written to.
20041023	Minor updates.
20041024	Updates to the PC-style keyboard controller, used by PICA.
		Updates to the PICA (Jazz) interrupt system. Both NetBSD/arc
		and OpenBSD/arc now reach userland with PICA emulation, and
		can be interacted with (there are a few programs on the
		INSTALL kernel ramdisks). In the case of OpenBSD, a VGA text
		console and PC-style keyboard controller are used; NetBSD
		runs on serial console.
		Adding a framework for DMA transfer for the ASC SCSI
		controller.
		Implementing a R4030 DMA controller for PICA, enough to let
		OpenBSD/arc and NetBSD/arc be installed on an emulated
		Pica. :-)
		Updates to the documentation.
20041025	Working on ISA interrupts for PICA.
		Adding an Olivetti M700 emulation mode (-A7).
		Better separation of PICA and M700 stuff (which I accidentally
		mixed up before, I thought the M700 Linux kernel would 
		also work on PICA because it almost booted).
		Writing a skeleton G364 framebuffer for M700, enough to show
		the Linux penguin and some text, although scrolling isn't
		correctly implemented yet.
		Adding a dummy SONIC (ethernet) device, dev_sn, for PICA.
		Fixing the passing of OSLOADOPTIONS for ARC, the default is
		now "-aN" which works fine with OpenBSD/arc and NetBSD/arc.
20041027	Minor updates.
20041029	Adding a Sony NeWS "newsmips" emulation mode skeleton (-f).
		Found and fixed a bug which prevented Linux/IP32 from running
		(the speed-hack-jump-optimization fix I made a few weeks ago
		was buggy).
		Adding the trunc.w.fmt and trunc.l.fmt instructions, although
		they are probably not really tested yet.
		Changes to how floating point values are handled in
		src/coproc.c, but right now it is probably very unstable.
20041101	I had accidentally removed the instructions on how to install
		Ultrix from doc/index.html. They are back now.
		Adding a -Z option, which makes it easier to run dual- or
		triple-head with Ultrix. (Default nr of graphics cards
		without -X is 0, with -X is 1.)
		Minor update which makes it possible to switch to the left
		monitor when running triple-head, not just right as before.
		When using more than one framebuffer window, and the host's
		mouse cursor is in a different window than the emulated mouse
		cursor, the emulated mouse will now try to move "very far",
		so that it in practice changes screen.
		Running Ultrix with dual- and triple-head now feels really
		great.
20041101(later)	OpenBSD/arc and Linux/Olivetti-M700 don't both work at the
		same time with the speed-hack stuff. So, from now on, you
		need to add -J for Linux, and add nothing for OpenBSD.
20041102	Minor update for OSF/1 V4.0 (include sys/time.h in src/net.c
		and add -D_POSIX_PII_SOCKET to the C compiler flags).
20041103	Minor updates for the release.
		For some reason, Mach/PMAX caused the emulator to bug out on
		SunOS/sparc64 (compiled in 64-bit mode); a minor update/hack
		to dev_asc fixed this.

==============  RELEASE 0.2.2  ==============

20041103	Minor updates.
20041104	Minor updates.
20041105	Running with different framebuffer windows on different X11
		displays works now (even with different bit depths and
		endiannesses on different displays). A new command line option
		(-z) adds DISPLAYs that should be used.
		Update regarding how DECstation BT459 cursors are used;
		transparency :-) and some other bug fixes.
20041106	More bt459 updates. The cursor color seems to be correct for
		NetBSD, OpenBSD, Ultrix, and Sprite.
		Some minor bintrans updates (redesigning some things).
20041107	More bintrans updates (probably broken for non-Alpha targets).
		Moving doc/mips64emul.1 to man/.
20041108	Some updates.
20041109	More updates. Bintrans experiments mostly.
20041110	Some minor bintrans updates.
20041111	Minor updates.
20041112	A little rewrite of the bintrans system (again :-), this time
		a lot more naïve and non-optimizing, in order to support delay
		slots in a much simpler way.
		Ultrix 4.5 boots into a usable desktop on my home machine in
		3min 28sec, compared to 6-8 minutes without bintrans.
20041113	Some minor bintrans updates.
20041114	More bintrans updates. Ultrix now boots in exactly 3 minutes
		on my home machine.
20041115	More bintrans updates.
20041116	Bintrans updates.
20041117	Working on dev_dec_ioasic and related issues.
		Adding support for letting translated code access devices in
		some cases (such as framebuffers).
20041118	Moving some MIPS registers into Alpha registers, which gives
		a speed improvement.
		Beginning to write an i386 bintrans backend. Skeleton stuff
		works, lui, jr/jalr, addiu/daddiu/andi/ori/xori, j/jal,
		addu/daddu/subu/xor/or/nor/and.
20041119	dsubu/sll/srl/sra, rfe,mfc0,dmfc0, beq,bne, delayed branches.
		Some load/store (but not for bigendian emulation yet.)
		Time to reach Ultrix 4.5's graphical login on a 2.8 GHz Xeon
		host is now down to 20 seconds!
		Adding bgez, bltz, bgtz, and blez to the i386 backend.
20041120	Minor updates (bintrans related mostly).
		Time to reach Ultrix login on the Xeon is now 11 seconds.
		Adding 'mult', 'multu' and some parts of mtc0 to the Alpha
		backend.
		The transparency updates to the X11 cursor support made the
		OpenBSD/arc cursor disappear; that has been fixed now.
		Unfortunately, something with Acer Pica emulation is broken
		when bintrans is enabled.
20041121	Making tlbwr, tlbwi, tlbp, tlbr callable directly from
		translated code.
		Adding sltiu, slti, slt, and sltu to the i386 backend.
20041122	More bintrans updates.
		With the Alpha backend, the status and entryhi registers
		can (in some cases) be written without exiting to the main
		loop. Ultrix boot time until a usable desktop is reached
		is about 1 min 35 seconds on the 533 MHz pca56.
		Adding srlv, srav, and sllv to the i386 backend.
20041123	Removing the special handling of high physical addresses for
		DECstation emulation from the main memory handling code, and
		replacing it with a mirror device instead. (This results in
		a tiny increase in performance, and cleaner code.)
		Various minor updates.
20041124	Ripping out _all_ bintrans load/store code, because I have
		a new idea I'd like to try out.
		A total rewrite of the load/store system. It works when
		emulating 32-bit MIPS, but not for 64-bit code yet.
		Some minor updates to the dev_fb, but no speed improvement.
		Making the 'le' ethernet device's SRAM work with bintrans.
20041125	Various updates.
		Adding a little "bootup logo" to the framebuffer.
		There is now one translate_address() for R3000-style MMUs,
		and one for the other types. (This gives a tiny speed
		improvement.)
20041126	Minor updates, bintrans.
		Fixing the bug which caused OpenBSD/arc (R4000) to bug out;
		it was introduced between the 7th and 10th of November
		when moving up the check for interrupts to above the code
		which runs bintrans code, in src/cpu.c.
		Adding movn and movz to the Alpha bintrans backend.
20041127	Various minor updates.
20041128	Making the R2000/R3000 caches work with bintrans, even in
		isolated mode. (Not true cache emulation, but it works with
		NetBSD/pmax, OpenBSD/pmax, and Ultrix.)
		Making the default cache size for R3000 4KB instr, 4 KB data;
		a real R3000 could have 64KB each, but emulated OSes run
		faster when they think the cache is smaller :-)
		Updates to the i386 backend: the nr of executed instructions
		is now placed in ebp at all times, and some support for
		mtc0 similar to how it is done in the Alpha backend has been
		added. A full NetBSD/pmax 1.6.2 install can now be done in
		5 minutes 35 seconds, on a 2.8 GHz Xeon host (with -bD2 -M20).
		Adding mult and multu to the i386 bintrans backend.
		Reducing the number of malloc/free calls used by the
		diskimage subsystem.
20041129	Minor updates to the Alpha bintrans backend.
20041130	Trying to fix the bug which prevents Linux from working
		with bintrans. It _seems_ to work now. (Pages could in some
		cases be written to after they were translated, but that
		has been fixed now.)
		A couple of other minor fixes.
		Minor updates to the Alpha backend (directly using Alpha
		registers in some cases, instead of loading into temporaries).
		Updates to the i386 backend (special hacks for 32-bit
		MIPS emulation, which are fast on i386, for example only
		updating half of the pc register).
20041201	More updates to the i386 backend, similar to those yesterday.
		Preparing for release 0.2.3.
		Adding a generic load/store mechanism, which is used when the
		32-bit optimized version cannot be used (for R4000 etc).

==============  RELEASE 0.2.3  ==============

20041202	If ALWAYS_SIGNEXTEND_32 is defined in misc.h, and an
		incorrectly extended register is detected, the emulator now
		exits instead of continuing.
		Removing the LAST_USED_TLB_EXPERIMENT stuff.
		Minor updates to work better with Windows NT's ARCINST.EXE;
		printing 0x9b via arcbios becomes ESC + '[', and the ARC
		memory descriptor stuff has been generalized a bit more.
		Adding arcbios hacks for Open(), Seek(), GetRelativeTime(),
		and Read() to allow WinNT's SETUPLDR to read the filesystem
		on the diskimage used for booting.
20041203	Adding a terminal emulation layer which converts arcbios
		stdout writes to "VGA text console" cell characters. Seems
		to work with Windows NT and arcdiag.
		Adding a 8x8 font to dev_vga.
		Adding more ARC components to the component tree (for PICA
		emulation).
20041204	Minor updates.
		More updates to dev_vga. Adding a 8x10 font.
		Adding a hack so that the framebuffer logo is visible at
		startup with dev_vga. (It disappears at the first scroll.)
		A minor fix for a bug which was triggered when running
		dual- or triple-head, on 2 or 3 actual X11 displays.
20041205	Fixing a bintrans bug.
		Some other minor updates (some of them bintrans related).
20041206	Moving the web page to http://gavare.se.
		Adding a hack for mmap() which supports anonymous mapping
		using /dev/zero, but not using MAP_ANON{,YMOUS}.
		Separating out opcodes.h, cop0.h, and cpu_types.h from misc.h.
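The /dev/zero approach to anonymous mappings mentioned above usually looks something like the following on POSIX hosts. A minimal sketch; zero_map() is a hypothetical name and the emulator's own wrapper may be structured differently:

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Obtain zero-filled, private memory by mapping /dev/zero, for hosts
 * whose mmap() has no MAP_ANON / MAP_ANONYMOUS flag.
 * Returns NULL on failure.
 */
void *zero_map(size_t len)
{
	void *p;
	int fd = open("/dev/zero", O_RDWR);

	if (fd < 0)
		return NULL;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	close(fd);	/* the mapping stays valid after the descriptor is closed */
	return p == MAP_FAILED ? NULL : p;
}
```

Because the mapping is MAP_PRIVATE, writes go to copy-on-write pages and never reach the device, so the result behaves like an ordinary anonymous mapping.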
20041207	Minor bintrans update. (In some cases, it isn't necessary
		to return to the main loop, when translating from a new page.)
		Some other minor i386 bintrans backend optimizations.
		And some other minor updates.
		i386 backend update: the lowest 32 bits of the pc register
		are now placed in an i386 register.
20041208	Adding GetConfigurationData() and some support for config
		data, to src/arcbios.c.
		Adding a bogus 0xbd SCSI command (used by Windows NT). It is
		not listed in http://www.danbbs.dk/~dino/SCSI/SCSI2-D.html.
		If the framebuffer cursor contains more than 1 color, then
		the host's X11 cursor disappears. (Nice for DECstation
		emulation with emulated X.)
		For ARC and SGI emulation, if an exception occurs before an
		exception handler is installed, the emulator now exits
		nicely (as suggested by Alec Voropay).
		A couple of minor updates to the ARCBIOS emulation subsystem.
		The single step debugger is now automatically entered when
		all CPUs have stopped running, unless there was a clean
		shutdown of some kind (PROM halt() call, or similar).
		Adding a -V option for starting up in a paused state, into
		the single-step debugger.
		Adding a note about 'mmon' to the documentation
		(http://www.brouhaha.com/~eric/software/mmon/).
20041209	Fixes to devices/console.c which makes cursor keys and such
		a bit more reliable.
		ARCBIOS hack/update which creates memory descriptors _after_
		loading the executable. (Seems to work with OpenBSD/arc,
		NetBSD/arc, arcdiag, IRIX, NetBSD/sgimips, OpenBSD/sgi, and
		some Windows NT executables.)
		ARCBIOS support for cursor keys (ESC + '[' ==> 0x9b).
		A bintrans update (for 32-bit emulation) which speeds up
		jumps between pages, if code is already translated.
		Changing the default bintrans cache from 20 to 24 MB.
20041210	Optimizing unaligned load/stores a little bit in src/cpu.c.
		Omitting the check for nr of executed bintrans instructions
		on some forward jumps.
		Adding the 'syscall' and 'break' instructions to the
		bintrans backends.
		Allowing more bits of the status register to be written to
		from within translated code, on R3000.
		Getting rid of the final pixel when hiding the host's mouse
		cursor.
		store_buf() now copies data 8 or 4 bytes at a time, when
		possible. (This speeds up emulated ROM disk reads, etc.)
		Tiny bug fix: coprocessor unusable exceptions are now also
		generated (for coproc 1..3) even when in kernel mode, if the
		coprocessors are not enabled. This allows a Debian installation
		to proceed further than before. (It's still very unstable,
		though.)
20041212	Updating doc/index.html with better Debian installation
		instructions.
		If SLOWSERIALINTERRUPTS is defined at compile time, interrupts
		from the dc7085 device will not come as often as they normally
		do. This makes Debian seem more stable.
		Decreasing the bintrans cache to 20 MB again.
		Updating some files in preparation for a 0.2.4 release.
20041213	Updating the docs on how to install NetBSD 2.0/pmax, and also
		some updates to the section on installing Debian.
		32-bit bintrans backend optimization: don't inline large
		chunks of code, such as general jumps.
20041214	Minor fix for coproc unusable for R4000 (it's the PC that
		matters, not the KSU bits).
		Separating out the debugger from emul.c into debugger.c.
		Rewriting parts of the debugger.
		Removing the -U command line option, as it wasn't really
		useful. Also removing the -P option.
		Renaming all instances of dumppoint to breakpoint, as that
		is what it really is.
		When a breakpoint is reached, the single-step debugger is
		entered, instead of just turning on instruction trace.
		Adding a 'breakpoints' debugger command.
		Better fix for coproc unusable on R4000: the KSU bits matter,
		but the ERL and EXL bits override that.
		Fix which allows Debian to boot directly from a disk image
		(with DELO). (It reads multiple separate areas from disk.)
		Update to the SLOWSERIALINTERRUPTS stuff, making it even
		slower.
		Fixes based on feedback from Alec Voropay (-Q with ARC
		emulation skips the setup of arcbios data structures in
		memory, and no sign-extension _after_ writing a 32-bit
		value to a 64-bit coproc 0 register).
		Adding a 'devices' command to the debugger.
		The 'registers' and 'tlbdump' commands now take an optional
		argument (a cpu id).
		Adding rudimentary tab-completion and cursor key stuff to
		debugger_readline().
		Adding some more debugger commands: 'bintrans' and 'machine'.
20041215	Adding a 'devstate' command; implementing a skeleton for a
		state function for the bt459 device.
		Implementing yet another variant of the SLOWSERIALINTERRUPTS
		stuff.
		Implementing more of the different exception offsets (taking
		CAUSE_IV and STATUS_BEV into account).
		hpc_bootinfo should now be correctly filled on big-endian
		hosts.
		Always shift left by 12, not by pageshift, to get physical
		addresses on MMU4K etc. (Thanks to Alec Voropay for noticing
		this.)
20041216	The KN02's CSR can now be read from bintranslated code.
		Adding a dummy dev_sgi_mec.
20041217	The default framebuffer and model settings for -F (hpcmips)
		should now be almost like Cassiopeia E-500.
		Changing -DSLOWSERIALINTERRUPTS into a command line option, -U.
20041218	Continuing a little bit on the mec controller.
		Removing lots of #include <math.h> that weren't really used.
20041219	Fixing stuff that broke because of the pageshift bugfix.
		Adding an argument to the s (step) debugger command, for doing
		more than 1 step at a time.
		ARCBIOS components representing disk images are now created
		to actually match the disk images in use, and some other
		arcbios-related updates; adding a dummy GetComponent().
		Adding a 'lookup' command to the debugger, for symbol lookups.
		Adding a "NEC Express RISCserver" mode (NEC-R96, -A8).
		Adding a dummy ARCBIOS GetFileInformation(), GetTime(), and
		SetEnvironmentVariable().
20041220	Improved command line editing (including command history)
		in the debugger.
		Separating some more .h files from each other, and fixing
		some Solaris compiler warnings.
20041221	Minor updates.
20041222	Minor updates; hpcmips (BE300, VR41xx) stuff.
		The 'register' debugger command is now 'reg', and it can
		be used to modify registers, not just read them.
		The syntax for hpcmips (-F) is now -F xx, where xx is a
		machine model identifier. (1 = BE300.)
20041223	Some really minor updates.
20041226	Minor updates to doc/index.html (NetBSD 1.6.2 -> 2.0, and
		some other rearrangements).
		Many updates to the debugger (better register manipulation,
		breakpoint manipulation, and other updates).
		Fix to dev_cons.c to allow the regression tests to work again.
		The configure script now tries to detect the presence of a
		MIPS cross compiler. (Used by "make regtest".)
		Regression tests are now run both with and without bintrans.
20041227	Some hacks to the VR41xx code to allow Linux for BE300 to
		get far enough to show the penguin on the framebuffer.
20041228	Merging dev_kn01_csr.c and dev_vdac.c into dev_kn01.c.
20041229	Various updates to the debugger (nicer tlb output and other
		things).
		Some floating point fixes in src/coproc.c (mov is not
		an arithmetic instruction), and in src/cpu.c (ldcX/sdcX in
		32-bit mode uses register pairs).
		'-O' now also affects the bootstring for SGI and ARC emulation.
		Bintrans updates (slightly faster 32-bit load/store on alpha).
		Updates to the i386 backend too, but no real speed improvement.
20041230	Cleaning up parts of the 64-bit virtual-to-physical code for
		R10000, and per-machine default TLB entries can now be set
		for SGI and ARC machines.
		Fix: SGI-IP27 is ARC64, not ARCS.
20050101	Minor updates.
20050102	Minor updates.
		Fixing a 32-bit 'addu' bug in the bintrans backends.
		Allowing fast load/stores even in 64-bit bintrans mode, if
		the top 32 bits are either 0x00000000 or 0xffffffff (for Alpha
		only).
		Re-enabling ctc0/cfc0 (but what do they do?).
		Adding beql, bnel, blezl, and bgtzl to the Alpha backend.
20050103	Adding fast 32-bit load/store for 64-bit mode emulation to
		the i386 backend too (similar to the Alpha code). Not really
		tested yet, though.
		Adding an incomplete regression test case for lwl/lwr/ldl/ldr.
		Playing around with bintranslated lwl and lwr for Alpha.
20050104	Changing many occurrences of pica to jazz.
		Various other updates.
20050105	Fixing some more bintrans bugs (both Alpha and i386).
		Unaligned stores that cause tlb refill exceptions should now
		cause TLBS exceptions, not TLBL.
		Adding experimental swl and swr to the Alpha backend.
		Adding lwl, lwr, swl, and swr to the i386 backend.
20050106	Adding another hpcmips model (Casio E-105, -F2), and doing
		some updates to the VR41xx code. NetBSD/hpcmips prints some
		boot messages.
20050108	Minor updates.
20050109	dev_dec5500_ioboard.c and dev_sgec.c => dev_kn220.c.
		dev_crime.c, _mace.c, and _macepci.c => dev_sgi_ip32.c.
		Also adding dev_sgi_mec, _ust, and _mte into dev_sgi_ip32.c.
		A slight license change. Still revised BSD-style, though.
		memory_v2p.c is now included separately for MMU10K and
		MMU8K.
		Fixing a NS16550 bug, triggered by NetBSD 2.0, but not 1.6.2.
		Refreshing the UltraSPARC bintrans backend skeleton.
		Merging dev_decbi, _deccca, and _decxmi into dev_dec5800.c.
		Sparc backend instructions done so far: mthi/mtlo/mfhi/mflo,
		lui, addu, daddu, subu, dsubu, and, or, nor, xor, sll, dsll,
		srl, and sra.
		Adding more sparc backend instructions: addiu, daddiu, xori,
		ori, andi, srlv, srav, sllv, slt, sltu, slti, sltiu.
20050110	Changing the default bintrans cache to 16 MB, and some other
		minor updates.
		Adding div and divu to the i386 backend (but not Alpha yet).
		More work on ARCBIOS emulation.
		Trying to find a bug which affects Linux on Playstation 2 in
		bintrans mode.
20050111	Moving around some Playstation 2 stuff, but I haven't found
		the bug yet. It is triggered by load/stores.
		More ARCBIOS updates, enough to let Windows NT partition
		disks in some rudimentary fashion.
20050112	Testing for release 0.2.4.
		Fixes to suppress compiler warnings.

==============  RELEASE 0.2.4  ==============

20050113	Minor updates.
20050114	Fix to the Alpha bintrans backend to allow compilation with
		old versions of gcc (2.95.4).

==============  RELEASE 0.2.4.1  ==============

20050115	Various updates and fixes: some IP32 stuff, the debugger,
		ns16550 loopback tx isn't transmitted out anymore, ...
		Removing old/broken R10000 cache hacks, which weren't really
		used.
20050116	Minor updates to the documentation on using PROM images.
		Adding ARCBIOS function 0x100 (used by IRIX when returning
		from main, but undocumented).
		MC146818 updates (mostly SGI-related).
		ARCS64 updates (testing with an OpenBSD snapshot in IP27
		mode). This causes Linux/IP30 to not work. Maybe IP27 and
		IP30 differ, even though both are 64-bit?
		Removing some nonsensical ARCS64 code from machine.c.
		Better handling of 128MB and 512MB memory offsets used by
		various SGI models.
		Trying to revert the ARCS64 changes (OpenBSD/sgi does
		seem to be aware of 64-bit vs 32-bit data structures in
		_some_ places, but not all), to make Linux/IP30 work again.
		Adding "power off" capability to the RTC, as used on IP32
		(and possibly IP30 and others).
		Some IP30 updates.
20050117	Debugger updates (symbolic register names instead of just rX,
		and using %08x instead of %016llx when emulating 32-bit CPUs
		in more places than before).
		Removing the dummy sgi_nasid and sgi_cpuinfo devices.
		Also using symbolic names for coprocessor 0 registers.
		Adding DEV_MP_MEMORY to dev_mp.c.
		Adding a 'put' command to the debugger.
		ARCBIOS function 0x100 used by IRIX seems to _NOT_ be a
		ReturnFromMain(), but something else undocumented.
		The count and compare registers are now 32-bit in all
		places, as they should be. (This causes, among other things,
		OpenBSD/sgi to not hang randomly in userspace anymore.)
		On breakpoints, the debugger is now entered _at_ the
		instruction at the breakpoint, not after it.
		Some cursor keys now work when input via X.
		Refreshing the MC146818 device a bit more.
20050118	Trying to add some support for less-than-4KB virtual pages,
		used by at least VR4131. Thanks to Alexander Yurchenko for
		noticing this. (I'm assuming for now that all R41xx work
		this way, which is not necessarily true.) It doesn't really
		work yet though.
		Renicing the "loading files" messages and other things
		displayed during startup.
		Changing the disassembly output of ori, xori, and andi to
		unsigned hex immediate, instead of decimal (as suggested
		by Alec Voropay).
		configure-script update for HP-UX, and switching from using
		inet_aton() to inet_pton() (as suggested by Nils Weller).
		Also adding -lnsl on Solaris, if required by inet_pton().
		Lots of minor R4100-related updates.
20050119	Correcting the R4100 config register in src/coproc.c, and
		a minor update to dev_vr41xx.
		Finally began a redesign/remodelling/cleanup that I have had
		in mind for quite some time... moving many things that were
		in struct emul into a new struct machine.
		Userland emulation now works with bintrans.
		Refreshing the LANCE controller (dev_le.c).
		Fixing the LK201 keyboard id.
20050120	Continuing on the remodelling/cleanup.
		Fixing the SCSI bug (which was triggered sometimes by
		NetBSD 2.0/pmax on Linux/i386 hosts).
		Adding a speed-limit hack to the mc146818 device when running
		in DECstation mode (limiting to emulated 30 MHz clock, so
		that Ultrix doesn't freak out).
		Adding an ugly workaround for the floating-point bug which
		is triggered when running NetBSD/pmax 2.0 on an Alpha host.
		The count/compare interrupt will not be triggered now, if
		the compare register is left untouched.
		Many, many other fixes...
20050121	Continuing the remodelling/cleanup. (Mostly working on the
		network stack, and on moving towards multiple emulations
		with multiple machines per emulation.)
		Fixbug: not clearing lowest parts of lo0 and hi on tlbr
		(seems to increase performance when emulating Linux?).
20050122	Continuing the remodelling/cleanup.
		Linux on DECstation uses a non-used part of the RTC registers
		for the year value; this is supported now, so Linux thinks
		it is 2005 and not 2000.
		Began hacking on something to reply to Debian's DHCP requests,
		but it's not working yet.
20050123	Continuing the remodelling/cleanup.
20050124	Continuing the remodelling/cleanup.
		Converting the dev_vga charcell memory to support direct
		bintrans access (similar to how dev_fb works), and fixing a
		couple of bintrans bugs in the process.
		The emulator now compiles under OpenBSD/arc 2.3 without
		crashing (mostly due to the bintrans fixes, but also some
		minor updates to the configure script).
20050125	Continuing the remodelling/cleanup.
		The '-a' option was missing in the Hello World example in the
		documentation. (Thanks to Soohyun Cho for noticing this.)
20050126	Continuing the remodelling/cleanup. Moving around stuff in
		the header files, etc. Adding a '-K' command line option, which
		forces the debugger to be entered at the exit of a simulation,
		regardless of failure or success. Beginning to work on the
		config file parser.
		Splitting doc/index.html into experiments.html, guestoses.html,
		intro.html, and misc.html.
		Updating the man page and adding a skeleton section about the
		configure files to doc/misc.html.
20050127	Minor documentation updates.
20050128	Continuing the remodelling/cleanup, mostly working on the
		config file parser (adding a couple of machine words, enough
		to run simple emulations, and adding support for multi-line
		comments using tuborgs).
		Removing some command line options for the least working
		emulation modes (-e, -f, -g, -E, -H), adding new -E and -e
		options for selecting machine type.
		Moving global variables from src/x11.c into struct machine (a
		bit buggy, but it seems to almost work).
20050129	Removing the Playstation 2 mode (-B) and hpcmips modes (-F)
		from the command line as well.
		Changing the -T command line option from meaning "trace on bad
		address" to meaning "enter the single-step debugger on bad
		address".
		More updates to the configuration file parser (nested tuborg
		comments, more options, ...).
		Making -s a global setting, not just affecting one machine.
		Trying to fix the X11 event stuff... but it's so ugly that it
		must be rewritten later.
		Continuing the multi-emul cleanup.
		Bugfixes and other updates to dev_vga.
20050130	Continuing the remodelling/cleanup. Finally moving out the
		MIPS-dependent stuff of the cpu struct into its own struct.
		Renaming cpu.c to cpu_mips.c, and cpu_common.c to cpu.c.
		Adding a dummy cpu_ppc.c.
		Removing the UltraSPARC bintrans backend.
		Many other minor updates.
		src/file.c should now be free from MIPS dependencies.
20050131	Continuing a little bit more on src/file.c. PPC ELFs can now
		be loaded, it seems.
		Continuing on src/cpu_ppc.c.
		'mips' is undefined by the configure script, if it is defined
		by default. (Fixes build on at least OpenBSD/arc and
		NetBSD/arc, where gcc defines 'mips'.)
		A couple of other minor fixes.
		Removing the "Changing framebuffer resolution" section from
		doc/misc.h (because it's buggy and not very useful anyway).
		Adding a mystrtoull(), used on systems where there is no
		strtoull() in libc.
		Adding 'add_x11_display' to the configure file parser 
		(corresponding to the -z command line option).
		Continuing the multi-emul machine cleanup.
20050201	Minor updates (man page, RELEASE, README).
		Continuing the cleanup.
		Adding a 'name' field to the emul struct, and adding a command
		to the debugger ("focus") to make it possible to switch focus
		to different machines (in different emuls).
		Beginning to work on the PPC disassembler etc. Hello World
		for linux-ppc64 can be disassembled :-)
20050202	Adding a hack for reading symbols from Microsoft's variant of
		COFF files.
		Adding a dummy cpu_sparc.c and include/cpu_sparc.h.
		Cleaning up more to support multiple cpu families.
		Various other minor updates.
		Fixing another old-gcc-on-Alpha problem.
20050203	Bintrans cache size is now variable, settable by a new
		configuration file option 'bintrans_size'.
		The debugger can now theoretically call disassembler functions
		for cpu families with non-fixed instruction word length.
		Working more on the mec controller. It now works well enough
		to let both NetBSD/sgimips and OpenBSD/sgi connect to the
		outside world using ftp :-)
		Continuing on the cleanup of the networking subsystem.
20050204	Continuing the cleanup.
		Working on a way to use separate xterms for serial ports and
		other console input, when emulating multiple machines (or one
		machine with multiple serial lines active).
20050205	Minor documentation updates.
20050206	Moving console.c from devices/ to src/, and continuing the
		work on using separate windows for each serial console.
		Trying to get OpenBSD/sgi to boot with root-on-nfs on an
		emulated NetBSD/pmax server, but no success in setting up
		the server yet.
20050207	Continuing on the console cleanup.
		Adding a 'start_paused' configuration file option, and a
		'pause' command to the debugger.
20050208	Everything now builds with --withoutmips.
		Continuing on the documentation on how to run OpenBSD/sgi, but
		no actual success yet.
		sizeof => (int)sizeof in the configure script (as suggested by
		Nils Weller).
20050209	Adding a check for -lm to the configure script.
		Continuing on the cleanup: trying to make memory_rw
		MIPS-independent.
		Trying to make a better fix for the cdrom-block-size problems
		on FreeBSD. (It now works with a Windows NT 4.0 cdrom in my
		drive.)
		Began a clean-up of the userland subsystem.
20050210	Continuing the userland cleanup.
		IBM's Hello World example for Linux/PPC64 runs fine now.
20050211	Continuing the cleanup. Removing the --userland configure
		option (because support for userland is always included now).
		Working more on getting OpenBSD/sgi to boot with root on
		nfs. (Booting with the ramdisk kernel, and mounting root via
		nfs works, but not yet from the generic kernel.)
		Major update to the manpage.
		Removing the -G command line option (SGI modes).
20050212	Updating the documentation (experimental devices: dev_cons
		and dev_mp, better hello.c, and some other things).
20050213	Some minor fixes: documentation, 80 columns in some source
		files, better configure script options.
		Adding some more PPC instructions.
		Added a NOFPU flag to the MIPS cpu flags, so that executing
		FPU instructions on for example VR4xxx will fail (as suggested
		by Alexander Yurchenko).
20050214	Implementing more PPC instructions.
		Adding dev_pmppc.
20050215	Continuing the work on PPC emulation. Adding a (mostly non-
		working) NetBSD/powerpc userland mode, a (buggy)
		show_trace_tree thing (similar to the MIPS version).
20050216	Continuing...
20050218	Continuing the clean-up. (Merging the devices and devstate
		debugger commands, more 80-column cleanup, some documentation
		updates, ...).
20050219	Removing the -D, -A, and -a command line options. Updating the
		documentation, in preparation for the next release.
		Adding interrupt stuff to dev_cons.
		Single-stepping now looks/works better with bintrans enabled.
		Beginning the first phase of release testing; various minor
		updates to make everything build cleanly on Solaris.
20050220	Continuing testing for the release...
                
==============  RELEASE 0.3  ==============

20050221	Minor updates. Some more clean-up.
		Beginning on the new device registry stuff.
20050222	Continuing on the device stuff, and doing various other kinds
		of clean-up.
		Adding a dummy BeBox mode.
		Making the pc register common for all cpu families.
		Adding some more PPC instructions and fixing some bugs.
20050223	Continuing on the BeBox stuff, and adding more instructions.
		Adding an ns16550 to the VR4131 emulation (which is probably
		a close enough fake to the VR4131's SIU unit).
20050224	Minor updates. Adding dummy PReP, macppc, and DB64360 modes.
		Continuing on the device registry rewrite.
20050225	Continuing on the device stuff.
20050226	Continuing more on the device rewrite.
		Separating the "testmips" machine into testmips and baremips
		(and similarly with the ppc machine).
		Redesigning the device registry again :-)
		Adding a "device" command to the config file parser.
		Adding "device add" and "device remove" to the debugger.
		Removing pcidevs.h, because it was almost unused.
20050228	Correcting the Sprite disk image url in the documentation.
20050301	Adding a URISC cpu emulation mode (single-opcode machine).
20050303	Adding some files to the experiments directory (rssb_as.c,
		rssb_as.README, urisc_test.s).
		Continuing on the device stuff.
20050304	Minor documentation update. Also, the SPARC, PPC, and URISC
		modes are now enabled by default in the configure script.
		Some minor PPC updates (adding a VGA device to the bebox
		emulation mode).
20050305	Moving the static i386 bintrans runchunk code snippet (and the
		others) to be dynamically generated. (This allows the code to
		compile on i386 with old gcc.)
		Loading PPC64 ELFs now sets R2 to the TOC base.
		Changing the name of the emulator from mips64emul to GXemul.
		Splitting out the configuration file part of the documentation
		into its own file (configfiles.html).
20050306	Some really minor documentation updates.
		Adding a -D command line option (for "fully deterministic"
		behaviour).
20050308	Minor PPC updates. Adding a dummy OpenFirmware emulation layer.
20050309	Adding a hack for systems without inet_pton (such as Cygwin in
		Windows) as suggested by Soohyun Cho. (And updating the
		configure script too.)
		Adding a dummy HPPA cpu family.
		Some more OpenFirmware updates.
		Faster loading of badly aligned ELF regions.
20050311	Minor updates. Adding a dummy "NEC MobilePro 780" hpcmips
		machine mode; disabling direct bintrans access to framebuffers
		that are not 4K page aligned.
20050312	Adding an ugly KIU hack to the VR41xx device (which enables
		NetBSD/hpcmips MobilePro 780 keyboard input).
20050313	Adding a dummy "pcic" device (a pcmcia card controller).
		Adding a dummy Alpha cpu emulation mode.
		Fixing a strcmp length bug (thanks to Alexander Yurchenko for
		noticing the bug).
20050314	Some minor bintrans-related updates in preparation for a new
		bintrans subsystem: command line option -b now means "old
		bintrans", -B means "disable bintrans", and using no option at
		all selects "new bintrans".
		Better generation of MAC addresses when emulating multiple
		machines and/or NICs.
		Minor documentation updates (regarding configuration files).
20050315	Adding dummy standby, suspend, and hibernate MIPS opcodes.
		RTC interrupt hack for VR4121 (hpcmips).
		Enough of the pcic is now emulated to let NetBSD/hpcmips detect
		a PCMCIA harddisk controller card (but there is no support for
		ISA/PCMCIA interrupts yet).
		Adding preliminary instructions on how to install
		NetBSD/hpcmips.
		Continuing the attempt to get harddisks working with interrupts
		(pcic, wdc on hpcmips).
20050318	Minor updates. (Fixing disassembly of MIPS bgtz etc., 
		continuing on the device cleanup, ...)
20050319	Minor updates.
20050320	Minor updates.
20050322	Various minor updates.
20050323	Some more minor updates.
20050328	VR41xx-related updates (keyboard stuff: the space key and
		shifted and ctrled keys are now working in userland, i.e.
		NetBSD/hpcmips' ramdisk installer).
		Also adding simple cursor key support to the VR41xx kiu.
20050329	Some progress on the wdc.
		Updating the documentation of how to (possibly) install
		NetBSD/hpcmips, once it is working.
		Adding delays before wdc interrupts; this allows NetBSD
		2.0/hpcmips to be successfully installed!
		Mirroring physical addresses 0x8....... to 0x00000000 on
		hpcmips; this makes it possible to run X11 inside
		NetBSD/hpcmips :-)
		Updating the documentation regarding NetBSD/hpcmips.
		Fixing 16-bit vs 15-bit color in dev_fb.
20050330	Print a warning when the user attempts to load a gzipped
		file. (Thanks to Juan RP for making me aware of this "bug".)
20050331	Importing aic7xxx_reg.h from NetBSD.
		Adding a "-x" command line option, which forces xterms for
		each emulated serial port to always be opened.
		Adding a MobilePro 770 mode (same as 780, but different
		framebuffer address which allows bintrans = fast scrolling),
		and a MobilePro 800 (with 800x600 pixels framebuffer :-).
20050401	Minor updates.
20050402	Minor updates. (The standby and suspend instructions are
		bintransed as NOPs, and some minor documentation updates.)
20050403	Adding an Agenda VR3 mode, and playing around with a Linux
		kernel image, but not much success yet.
		Changing BIFB_D16_FFFF -> BIFB_D16_0000 for the hpcmips 
		framebuffers, causing NetBSD to boot with correct colors.
		New syntax for loading raw files: loadaddr:skiplen:
		initialpc:filename. (This is necessary to boot the Linux VR3
		kernels.)
		The Linux VR3 kernel boots in both serial console mode and
		using the framebuffer, but it panics relatively early.
20050404	Continuing on the AHC, and some other minor updates.
20050405	Adding a note in doc/experimental.html about "root1.2.6.cramfs"
		(thanks to Alec Voropay for noticing that it wasn't part
		of root1.2.6.kernel-8.00).
		Also adding a note about another cramfs image.
		-o options are now added to the command line passed to the
		Linux kernel, when emulating the VR3.
		Adding a MobilePro 880 mode, and a dummy IBM WorkPad Z50 mode.
20050406	Connecting the VR3 serial controller to irq 9 (Linux calls this
		irq 17), and some other interrupt-related cleanups.
		Reducing the memory overhead per bintranslated page. (Hopefully
		this makes things faster, or at least not slower...)
20050407	Some more cleanup regarding command line argument passing for
		the hpcmips modes.
		Playing with Linux kernels for MobilePro 770 and 800; they get
		as far as mounting a root filesystem, but then crash.
		Doing some testing for the next release.

==============  RELEASE 0.3.1  ==============


/*
 * Copyright (C) 2004-2005 Anders Gavare. All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. The name of the author may not be used to endorse or promote products
 *    derived from this software without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 *
 * $Id: bintrans_alpha.c,v 1.114 2005/03/22 09:12:04 debug Exp $
 *
 * Alpha specific code for dynamic binary translation.
 *
 * See bintrans.c for more information. Included from bintrans.c.
 *
 *
 * Some Alpha registers that are reasonable to use:
 *
 *	Names		Numbers		Count
 *	t5..t7		6..8		3
 *	s0..s6		9..15		7
 *	a1..a5		17..21		5
 *	t8..t11		22..25		4
 *
 * These can be "mapped" to MIPS registers in the translated code,
 * except a0 which points to the cpu struct, and t0..t4 (or so)
 * which are used by the translated code as temporaries.
 *
 * 3 + 7 + 5 + 4 = 19 available registers. Of course, all (except
 * s0..s6) must be saved when calling external functions, such as
 * when doing load/store.
 *
 * Which are the 19 most commonly used MIPS registers? (This will
 * include the pc, and the "current number of executed translated
 * instructions".)
 *
 * The current allocation is as follows:
 *
 *	Alpha:		MIPS:
 *	------		-----
 *
 *	t5		pc (64-bit)
 *	t6		bintrans_instructions_executed (32-bit int)
 *	t7		a0 (mips register 4) (64-bit)
 *	t8		a1 (mips register 5) (64-bit)
 *	t9		s0 (mips register 16) (64-bit)
 *	t10		table0 cached (for load/store)
 *	t11		v0 (mips register 2) (64-bit)
 *	s0		delay_slot (32-bit int)
 *	s1		delay_jmpaddr (64-bit)
 *	s2		sp (mips register 29) (64-bit)
 *	s3		ra (mips register 31) (64-bit)
 *	s4		t0 (mips register 8) (64-bit)
 *	s5		t1 (mips register 9) (64-bit)
 *	s6		t2 (mips register 10) (64-bit)
 */

#define	MIPSREG_PC		-3
#define	MIPSREG_DELAY_SLOT	-2
#define	MIPSREG_DELAY_JMPADDR	-1

#define	ALPHA_T0		1
#define	ALPHA_T1		2
#define	ALPHA_T2		3
#define	ALPHA_T3		4
#define	ALPHA_T4		5
#define	ALPHA_T5		6
#define	ALPHA_T6		7
#define	ALPHA_T7		8
#define	ALPHA_S0		9
#define	ALPHA_S1		10
#define	ALPHA_S2		11
#define	ALPHA_S3		12
#define	ALPHA_S4		13
#define	ALPHA_S5		14
#define	ALPHA_S6		15
#define	ALPHA_A0		16
#define	ALPHA_A1		17
#define	ALPHA_A2		18
#define	ALPHA_A3		19
#define	ALPHA_A4		20
#define	ALPHA_A5		21
#define	ALPHA_T8		22
#define	ALPHA_T9		23
#define	ALPHA_T10		24
#define	ALPHA_T11		25
#define	ALPHA_ZERO		31

106     static int map_MIPS_to_Alpha[32] = {
107     ALPHA_ZERO, -1, ALPHA_T11, -1, /* 0 .. 3 */
108     ALPHA_T7, ALPHA_T8, -1, -1, /* 4 .. 7 */
109     ALPHA_S4, ALPHA_S5, ALPHA_S6, -1, /* 8 .. 11 */
110     -1, -1, -1, -1, /* 12 .. 15 */
111     ALPHA_T9, -1, -1, -1, /* 16 .. 19 */
112     -1, -1, -1, -1, /* 20 .. 23 */
113     -1, -1, -1, -1, /* 24 .. 27 */
114     -1, ALPHA_S2, -1, ALPHA_S3, /* 28 .. 31 */
115     };
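
/*
 * Illustrative sketch (not part of the original file): one way the
 * table above could be queried. A MIPS register is either cached in an
 * Alpha register (non-negative entry) or must be accessed through the
 * cpu struct in memory (entry -1). The names sketch_map and
 * sketch_host_reg_for() are hypothetical; only the table contents
 * mirror map_MIPS_to_Alpha[].
 */

```c
/* Alpha register numbers, same values as the ALPHA_* defines. */
enum {
    SK_T7 = 8, SK_S2 = 11, SK_S3 = 12, SK_S4 = 13, SK_S5 = 14,
    SK_S6 = 15, SK_T8 = 22, SK_T9 = 23, SK_T11 = 25, SK_ZERO = 31
};

static const int sketch_map[32] = {
    SK_ZERO, -1,    SK_T11, -1,     /* 0 .. 3 */
    SK_T7,   SK_T8, -1,     -1,     /* 4 .. 7 */
    SK_S4,   SK_S5, SK_S6,  -1,     /* 8 .. 11 */
    -1,      -1,    -1,     -1,     /* 12 .. 15 */
    SK_T9,   -1,    -1,     -1,     /* 16 .. 19 */
    -1,      -1,    -1,     -1,     /* 20 .. 23 */
    -1,      -1,    -1,     -1,     /* 24 .. 27 */
    -1,      SK_S2, -1,     SK_S3   /* 28 .. 31 */
};

/*
 * Returns the Alpha register caching the given MIPS register, or -1 if
 * the register lives only in memory. Out-of-range numbers (such as the
 * negative pseudo-registers) also yield -1 in this sketch.
 */
int sketch_host_reg_for(int mipsreg)
{
    if (mipsreg < 0 || mipsreg > 31)
        return -1;
    return sketch_map[mipsreg];
}
```

/*
 * With this layout, MIPS sp (29) resolves to Alpha s2 (11), while a
 * register such as MIPS v1 (3) falls back to a load or store via a0.
 */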
116    
117    
118     struct cpu dummy_cpu;
119     struct mips_coproc dummy_coproc;
120     struct vth32_table dummy_vth32_table;
121    
122     unsigned char bintrans_alpha_imb[32] = {
123     0x86, 0x00, 0x00, 0x00, /* imb */
124     0x01, 0x80, 0xfa, 0x6b, /* ret */
125     0x1f, 0x04, 0xff, 0x47, /* nop */
126     0x00, 0x00, 0xfe, 0x2e, /* unop */
127     0x1f, 0x04, 0xff, 0x47, /* nop */
128     0x00, 0x00, 0xfe, 0x2e, /* unop */
129     0x1f, 0x04, 0xff, 0x47, /* nop */
130     0x00, 0x00, 0xfe, 0x2e /* unop */
131     };
132    
133    
134     /*
135     * bintrans_host_cacheinvalidate()
136     *
137     * Invalidate the host's instruction cache. On Alpha, we do this by
138     * executing an imb instruction.
139     *
140     * NOTE: A simple asm("imb"); would be enough here, but not all
141     * compilers support inline assembly, so the routine is instead
142     * stored as raw machine code in bintrans_alpha_imb[] above.
143     */
144     static void bintrans_host_cacheinvalidate(unsigned char *p, size_t len)
145     {
146     /* Long form of ``asm("imb");'' */
147    
148     void (*f)(void);
149     f = (void *)&bintrans_alpha_imb[0];
150     f();
151     }
152    
153    
154     /*
155     * lda sp,-128(sp) some margin
156     * stq ra,0(sp)
157     * stq s0,8(sp)
158     * stq s1,16(sp)
159     * stq s2,24(sp)
160     * stq s3,32(sp)
161     * stq s4,40(sp)
162     * stq s5,48(sp)
163     * stq s6,56(sp)
164     *
165     * jsr ra,(a1),<back>
166     * back:
167     *
168     * ldq ra,0(sp)
169     * ldq s0,8(sp)
170     * ldq s1,16(sp)
171     * ldq s2,24(sp)
172     * ldq s3,32(sp)
173     * ldq s4,40(sp)
174     * ldq s5,48(sp)
175     * ldq s6,56(sp)
176     * lda sp,128(sp)
177     * ret
178     */
179     /* note: offsetof (in stddef.h) could possibly be used, but I'm not sure
180     if it will take care of the compiler problems... */
181     #define ofs_pc (((size_t)&dummy_cpu.pc) - ((size_t)&dummy_cpu))
182     #define ofs_pc_last (((size_t)&dummy_cpu.cd.mips.pc_last) - ((size_t)&dummy_cpu))
183     #define ofs_n (((size_t)&dummy_cpu.cd.mips.bintrans_instructions_executed) - ((size_t)&dummy_cpu))
184     #define ofs_ds (((size_t)&dummy_cpu.cd.mips.delay_slot) - ((size_t)&dummy_cpu))
185     #define ofs_ja (((size_t)&dummy_cpu.cd.mips.delay_jmpaddr) - ((size_t)&dummy_cpu))
186     #define ofs_sp (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_SP]) - ((size_t)&dummy_cpu))
187     #define ofs_ra (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_RA]) - ((size_t)&dummy_cpu))
188     #define ofs_a0 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_A0]) - ((size_t)&dummy_cpu))
189     #define ofs_a1 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_A1]) - ((size_t)&dummy_cpu))
190     #define ofs_t0 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_T0]) - ((size_t)&dummy_cpu))
191     #define ofs_t1 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_T1]) - ((size_t)&dummy_cpu))
192     #define ofs_t2 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_T2]) - ((size_t)&dummy_cpu))
193     #define ofs_v0 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_V0]) - ((size_t)&dummy_cpu))
194     #define ofs_s0 (((size_t)&dummy_cpu.cd.mips.gpr[MIPS_GPR_S0]) - ((size_t)&dummy_cpu))
195     #define ofs_tbl0 (((size_t)&dummy_cpu.cd.mips.vaddr_to_hostaddr_table0) - ((size_t)&dummy_cpu))
196     #define ofs_c0 ((size_t)&dummy_vth32_table.bintrans_chunks[0] - (size_t)&dummy_vth32_table)
197     #define ofs_cb (((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu)
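
/*
 * Illustrative sketch (not part of the original file): the ofs_*
 * macros compute field offsets by subtracting addresses taken from a
 * dummy instance. On a conventional compiler this is expected to agree
 * with the standard offsetof macro. struct sketch_cpu is a
 * hypothetical stand-in; the real struct cpu layout is not shown here.
 */

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpu. */
struct sketch_cpu {
    uint64_t pc;
    int      delay_slot;
    uint64_t gpr[32];
};

static struct sketch_cpu sketch_dummy;

/* Address-subtraction style, as used by the ofs_* macros. */
size_t sketch_gpr_ofs_by_subtraction(int r)
{
    return ((size_t)&sketch_dummy.gpr[r]) - ((size_t)&sketch_dummy);
}

/* Standard-library style, using offsetof from <stddef.h>. */
size_t sketch_gpr_ofs_by_offsetof(int r)
{
    return offsetof(struct sketch_cpu, gpr) + (size_t)r * sizeof(uint64_t);
}
```

/*
 * Both functions should return the same value for every register
 * index; whether offsetof avoids the compiler problems mentioned above
 * is not something this sketch can confirm.
 */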
198    
199    
200     static uint32_t bintrans_alpha_loadstore_32bit[19] = {
201     /*
202     * t1 = 1023;
203     * t2 = ((a1 >> 22) & t1) * sizeof(void *);
204     * t3 = ((a1 >> 12) & t1) * sizeof(void *);
205     * t1 = a1 & 4095;
206     *
207     * f8 1f 5f 20 lda t1,1023 * 8
208     * 83 76 22 4a srl a1,19,t2
209     * 84 36 21 4a srl a1, 9,t3
210     * 03 00 62 44 and t2,t1,t2
211     */
212     0x205f1ff8,
213     0x4a227683,
214     0x4a213684,
215     0x44620003,
216    
217     /*
218     * t10 is vaddr_to_hostaddr_table0
219     *
220     * a2 = tbl0 + t2 (address of the tbl0 entry; the entry itself is loaded into a3 below)
221     * 12 04 03 43 addq t10,t2,a2
222     */
223     0x43030412,
224    
225     /* 04 00 82 44 and t3,t1,t3 */
226     0x44820004,
227    
228     /* 00 00 72 a6 ldq a3,0(a2) */
229     0xa6720000,
230    
231     /* ff 0f 5f 20 lda t1,4095 */
232     0x205f0fff,
233    
234     /*
235     * a3 = tbl1 + t3 (address of the tbl1 entry; tbl1 is the pointer just loaded into a3)
236     * 13 04 64 42 addq a3,t3,a3
237     */
238     0x42640413,
239    
240     /* 02 00 22 46 and a1,t1,t1 */
241     0x46220002,
242    
243     /* 00 00 73 a6 ldq a3,0(a3) */
244     0xa6730000,
245    
246     /* NULL? Then return failure at once. */
247     /* bne a3, skip */
248     0xf6600003,
249    
250     0x243f0000 | (BINTRANS_DONT_RUN_NEXT >> 16), /* ldah t0,256 */
251     0x44270407, /* or t0,t6,t6 */
252     0x6bfa8001, /* ret */
253    
254     /* skip: */
255    
256     /* 01 30 60 46 and a3,0x1,t0 */
257     0x46603001,
258    
259     /* Get rid of the lowest bit: */
260     /* 33 05 61 42 subq a3,t0,a3 */
261     0x42610533,
262    
263     /* The rest of the load/store code was written with t3 as the address. */
264    
265     /* Add the offset within the page: */
266     /* 04 04 62 42 addq a3,t1,t3 */
267     0x42620404,
268    
269     0x6be50000 /* jmp (t4) */
270     };
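
/*
 * Illustrative sketch (not part of the original file): the index
 * arithmetic performed by the first instructions of the array above,
 * written in C. Shifting by 19 (or 9) and masking with 1023 * 8 yields
 * the table index already multiplied by the pointer size, assuming
 * 8-byte pointers as on Alpha. The sketch_* names are hypothetical.
 */

```c
#include <stdint.h>

/* Byte offset into table0: ((vaddr >> 22) & 1023) * 8. */
uint32_t sketch_tbl0_byte_offset(uint32_t vaddr)
{
    return (vaddr >> 19) & (1023 * 8);
}

/* Byte offset into the second-level table: ((vaddr >> 12) & 1023) * 8. */
uint32_t sketch_tbl1_byte_offset(uint32_t vaddr)
{
    return (vaddr >> 9) & (1023 * 8);
}

/* Offset within the 4 KB page. */
uint32_t sketch_page_offset(uint32_t vaddr)
{
    return vaddr & 4095;
}
```

/*
 * For vaddr 0x80401234 the top ten bits are 513, the next ten bits are
 * 1, and the page offset is 0x234.
 */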
271    
272     static void (*bintrans_runchunk)(struct cpu *, unsigned char *);
273    
274     static void (*bintrans_jump_to_32bit_pc)(struct cpu *);
275    
276     static void (*bintrans_loadstore_32bit)
277     (struct cpu *) = (void *)bintrans_alpha_loadstore_32bit;
278    
279    
280     /*
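281     * bintrans_write_quickjump(): Overwrite the instruction at quickjump_code with a direct "br" to the chunk at chunkoffset, if the target is within range.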
282     */
283     static void bintrans_write_quickjump(struct memory *mem,
284     unsigned char *quickjump_code, uint32_t chunkoffset)
285     {
286     int ofs;
287     uint64_t alpha_addr = chunkoffset +
288     (size_t)mem->translation_code_chunk_space;
289     uint32_t *a = (uint32_t *)quickjump_code;
290    
291     ofs = (alpha_addr - ((size_t)a+4)) / 4;
292    
293     /* printf("chunkoffset=%i, %016llx %016llx %i\n",
294     chunkoffset, (long long)alpha_addr, (long long)a, ofs); */
295    
296     if (ofs > -0xfffff && ofs < 0xfffff) {
297     *a++ = 0xc3e00000 | (ofs & 0x1fffff); /* br <chunk> */
298     }
299     }
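
/*
 * Illustrative sketch (not part of the original file): encoding an
 * Alpha "br" the way bintrans_write_quickjump() does. The displacement
 * is counted in instructions relative to the address following the
 * branch, and occupies the low 21 bits of the word. The function name
 * and the use of 0 as an "out of range" marker are assumptions made
 * for this sketch only.
 */

```c
#include <stdint.h>

/*
 * Returns the encoded "br" instruction for a branch located at address
 * 'from' jumping to address 'to', or 0 if the displacement is outside
 * the range accepted by the check in bintrans_write_quickjump().
 */
uint32_t sketch_encode_br(uint64_t from, uint64_t to)
{
    /* Signed distance in instructions, relative to from + 4. */
    int64_t ofs = (int64_t)(to - (from + 4)) / 4;

    if (ofs <= -0xfffff || ofs >= 0xfffff)
        return 0;

    return 0xc3e00000u | ((uint32_t)ofs & 0x1fffff);
}
```

/*
 * A forward branch over one instruction encodes as 0xc3e00001, and a
 * backward branch of three instructions as 0xc3fffffd.
 */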
300    
301    
302     /*
303     * bintrans_write_chunkreturn():
304     */
305     static void bintrans_write_chunkreturn(unsigned char **addrp)
306     {
307     uint32_t *a = (uint32_t *) *addrp;
308     *a++ = 0x6bfa8001; /* ret */
309     *addrp = (unsigned char *) a;
310     }
311    
312    
313     /*
314     * bintrans_write_chunkreturn_fail(): Set the BINTRANS_DONT_RUN_NEXT flag in t6 (the executed-instructions counter), then return.
315     */
316     static void bintrans_write_chunkreturn_fail(unsigned char **addrp)
317     {
318     uint32_t *a = (uint32_t *) *addrp;
319     /* 00 01 3f 24 ldah t0,256 */
320     /* 07 04 27 44 or t0,t6,t6 */
321     *a++ = 0x243f0000 | (BINTRANS_DONT_RUN_NEXT >> 16);
322     *a++ = 0x44270407;
323     *a++ = 0x6bfa8001; /* ret */
324     *addrp = (unsigned char *) a;
325     }
326    
327    
328     /*
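329     * bintrans_move_MIPS_reg_into_Alpha_reg(): Emit code that copies a MIPS register (or the pc, delay_slot or delay_jmpaddr pseudo-register) into alphareg.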
330     */
331     static void bintrans_move_MIPS_reg_into_Alpha_reg(unsigned char **addrp, int mipsreg, int alphareg)
332     {
333     uint32_t *a = (uint32_t *) *addrp;
334     int ofs, alpha_mips_reg;
335    
336     switch (mipsreg) {
337     case MIPSREG_PC:
338     /* addq t5,0,alphareg */
339     *a++ = 0x40c01400 | alphareg;
340     break;
341     case MIPSREG_DELAY_SLOT:
342     /* addq s0,0,alphareg */
343     *a++ = 0x41201400 | alphareg;
344     break;
345     case MIPSREG_DELAY_JMPADDR:
346     /* addq s1,0,alphareg */
347     *a++ = 0x41401400 | alphareg;
348     break;
349     default:
350     alpha_mips_reg = map_MIPS_to_Alpha[mipsreg];
351     if (alpha_mips_reg < 0) {
352     ofs = ((size_t)&dummy_cpu.cd.mips.gpr[mipsreg]) - (size_t)&dummy_cpu;
353     /* ldq alphareg,gpr[mipsreg](a0) */
354     *a++ = 0xa4100000 | (alphareg << 21) | ofs;
355     } else {
356     /* addq alpha_mips_reg,0,alphareg */
357     *a++ = 0x40001400 | (alpha_mips_reg << 21) | alphareg;
358     }
359     }
360     *addrp = (unsigned char *) a;
361     }
362    
363    
364     /*
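365     * bintrans_move_Alpha_reg_into_MIPS_reg(): Emit code that copies alphareg into a MIPS register (or the pc, delay_slot or delay_jmpaddr pseudo-register).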
366     */
367     static void bintrans_move_Alpha_reg_into_MIPS_reg(unsigned char **addrp, int alphareg, int mipsreg)
368     {
369     uint32_t *a = (uint32_t *) *addrp;
370     int ofs, alpha_mips_reg;
371    
372     switch (mipsreg) {
373     case MIPSREG_PC:
374     /* addq alphareg,0,t5 */
375     *a++ = 0x40001406 | (alphareg << 21);
376     break;
377     case MIPSREG_DELAY_SLOT:
378     /* addq alphareg,0,s0 */
379     *a++ = 0x40001409 | (alphareg << 21);
380     break;
381     case MIPSREG_DELAY_JMPADDR:
382     /* addq alphareg,0,s1 */
383     *a++ = 0x4000140a | (alphareg << 21);
384     break;
385     case 0: /* the zero register */
386     break;
387     default:
388     alpha_mips_reg = map_MIPS_to_Alpha[mipsreg];
389     if (alpha_mips_reg < 0) {
390     /* stq alphareg,gpr[mipsreg](a0) */
391     ofs = ((size_t)&dummy_cpu.cd.mips.gpr[mipsreg]) - (size_t)&dummy_cpu;
392     *a++ = 0xb4100000 | (alphareg << 21) | ofs;
393     } else {
394     /* addq alphareg,0,alpha_mips_reg */
395     *a++ = 0x40001400 | (alphareg << 21) | alpha_mips_reg;
396     }
397     }
398     *addrp = (unsigned char *) a;
399     }
400    
401    
402     /*
403     * bintrans_write_pc_inc(): Emit code that adds 1 to the executed-instructions counter (t6) and 4 to the MIPS pc (t5).
404     */
405     static void bintrans_write_pc_inc(unsigned char **addrp)
406     {
407     uint32_t *a = (uint32_t *) *addrp;
408    
409     /* lda t6,1(t6) */
410     *a++ = 0x20e70001;
411    
412     /* lda t5,4(t5) */
413     *a++ = 0x20c60004;
414    
415     *addrp = (unsigned char *) a;
416     }
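
/*
 * Illustrative sketch (not part of the original file): the hexadecimal
 * constants used by the emitters above follow two Alpha instruction
 * layouts. The helpers below rebuild some of those constants from
 * their fields; the helper names are hypothetical, and the field
 * layouts are the standard Alpha memory and operate (literal) formats.
 */

```c
#include <stdint.h>

/* Memory format: opcode(6) ra(5) rb(5) disp(16), e.g. ldq/stq/lda. */
uint32_t sketch_encode_mem(unsigned opcode, unsigned ra, unsigned rb, int disp)
{
    return ((uint32_t)opcode << 26) | ((uint32_t)ra << 21) |
           ((uint32_t)rb << 16) | ((uint32_t)disp & 0xffff);
}

/* Operate format with literal: opcode(6) ra(5) lit(8) 1 func(7) rc(5). */
uint32_t sketch_encode_operate_lit(unsigned opcode, unsigned func,
    unsigned ra, unsigned lit, unsigned rc)
{
    return ((uint32_t)opcode << 26) | ((uint32_t)ra << 21) |
           ((uint32_t)(lit & 0xff) << 13) | (1u << 12) |
           ((uint32_t)(func & 0x7f) << 5) | (uint32_t)rc;
}
```

/*
 * For example, "lda t5,4(t5)" is opcode 0x08 with ra = rb = 6, and
 * "addq t5,0,t0" is opcode 0x10, function 0x20, ra = 6, rc = 1.
 */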
417    
418    
419     /*
420     * bintrans_write_instruction__addiu_etc():
421     */
422     static int bintrans_write_instruction__addiu_etc(unsigned char **addrp,
423     int rt, int rs, int imm, int instruction_type)
424     {
425     uint32_t *a;
426     unsigned int uimm;
427     int alpha_rs, alpha_rt;
428    
429     /* TODO: overflow detection for ADDI and DADDI (not translated until then) */
430     switch (instruction_type) {
431     case HI6_ADDI:
432     case HI6_DADDI:
433     return 0;
434     }
435    
436     a = (uint32_t *) *addrp;
437    
438     if (rt == 0)
439     goto rt0;
440    
441     uimm = imm & 0xffff;
442    
443     alpha_rs = map_MIPS_to_Alpha[rs];
444     alpha_rt = map_MIPS_to_Alpha[rt];
445    
446     if (uimm == 0 && (instruction_type == HI6_ADDI ||
447     instruction_type == HI6_ADDIU || instruction_type == HI6_DADDI ||
448     instruction_type == HI6_DADDIU || instruction_type == HI6_ORI)) {
449     if (alpha_rs >= 0 && alpha_rt >= 0) {
450     /* addq rs,0,rt */
451     *a++ = 0x40001400 | (alpha_rs << 21) | alpha_rt;
452     } else {
453     *addrp = (unsigned char *) a;
454     bintrans_move_MIPS_reg_into_Alpha_reg(addrp, rs, ALPHA_T0);
455     bintrans_move_Alpha_reg_into_MIPS_reg(addrp, ALPHA_T0, rt);
456     a = (uint32_t *) *addrp;
457     }
458     goto rt0;
459     }
460    
461     if (alpha_rs < 0) {
462     /* ldq t0,"rs"(a0) */
463     *addrp = (unsigned char *) a;
464     bintrans_move_MIPS_reg_into_Alpha_reg(addrp, rs, ALPHA_T0);
465     a = (uint32_t *) *addrp;
466     alpha_rs = ALPHA_T0;
467     }
468    
469     if (alpha_rt < 0)
470     alpha_rt = ALPHA_T0;
471    
472     /* Place the result of the calculation in alpha_rt: */
473    
474     switch (instruction_type) {
475     case HI6_ADDIU:
476     case HI6_DADDIU:
477     case HI6_ADDI:
478     case HI6_DADDI:
479     if (uimm < 256) {
480     if (instruction_type == HI6_ADDI ||
481     instruction_type == HI6_ADDIU) {
482     /* addl rs,uimm,rt */
483     *a++ = 0x40001000 | (alpha_rs << 21)
484     | (uimm << 13) | alpha_rt;
485     } else {
486     /* addq rs,uimm,rt */
487     *a++ = 0x40001400 | (alpha_rs << 21)
488     | (uimm << 13) | alpha_rt;
489     }
490     } else {
491     /* lda rt,imm(rs) */
492     *a++ = 0x20000000 | (alpha_rt << 21) | (alpha_rs << 16) | uimm;
493     if (instruction_type == HI6_ADDI ||
494     instruction_type == HI6_ADDIU) {
495     /* sign extend, 32->64 bits: addl rt,0,rt */
496     *a++ = 0x40001000 | (alpha_rt << 21) | alpha_rt;
497     }
498     }
499     break;
500     case HI6_ANDI:
501     case HI6_ORI:
502     case HI6_XORI:
503     if (uimm >= 256) {
504     /* lda t1,uimm */
505     *a++ = 0x205f0000 | uimm;
506     if (uimm & 0x8000) {
507     /* 01 00 42 24 ldah t1,1(t1) <-- only if bit 15 is set, to undo lda's sign extension */
508     *a++ = 0x24420001;
509     }
510     }
511    
512     switch (instruction_type) {
513     case HI6_ANDI:
514     if (uimm < 256) {
515     /* and rs,uimm,rt */
516     *a++ = 0x44001000 | (alpha_rs << 21)
517     | (uimm << 13) | alpha_rt;
518     } else {
519     /* and rs,t1,rt */
520     *a++ = 0x44020000 | (alpha_rs << 21) | alpha_rt;
521     }
522     break;
523     case HI6_ORI:
524     if (uimm < 256) {
525     /* or rs,uimm,rt */
526     *a++ = 0x44001400 | (alpha_rs << 21)
527     | (uimm << 13) | alpha_rt;
528     } else {
529     /* or rs,t1,rt */
530     *a++ = 0x44020400 | (alpha_rs << 21) | alpha_rt;
531     }
532     break;
533     case HI6_XORI:
534     if (uimm < 256) {
535     /* xor rs,uimm,rt */
536     *a++ = 0x44001800 | (alpha_rs << 21)
537     | (uimm << 13) | alpha_rt;
538     } else {
539     /* xor rs,t1,rt */
540     *a++ = 0x44020800 | (alpha_rs << 21) | alpha_rt;
541     }
542     break;
543     }
544     break;
545     case HI6_SLTI:
546     case HI6_SLTIU:
547     /* lda t1,uimm (lda sign-extends the 16-bit immediate) */
548     *a++ = 0x205f0000 | uimm;
549    
550     switch (instruction_type) {
551     case HI6_SLTI:
552     /* cmplt rs,t1,rt */
553     *a++ = 0x400209a0 | (alpha_rs << 21) | alpha_rt;
554     break;
555     case HI6_SLTIU:
556     /* cmpult rs,t1,rt */
557     *a++ = 0x400203a0 | (alpha_rs << 21) | alpha_rt;
558     break;
559     }
560     break;
561     }
562    
563     if (alpha_rt == ALPHA_T0) {
564     *a++ = 0x5fff041f; /* fnop */
565     *addrp = (unsigned char *) a;
566     bintrans_move_Alpha_reg_into_MIPS_reg(addrp, ALPHA_T0, rt);
567     a = (uint32_t *) *addrp;
568     }
569    
570     rt0:
571     *addrp = (unsigned char *) a;
572     bintrans_write_pc_inc(addrp);
573     return 1;
574     }
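
/*
 * Illustrative sketch (not part of the original file): why the
 * ANDI/ORI/XORI path above emits an extra "ldah t1,1(t1)" when bit 15
 * of the immediate is set. lda sign-extends its 16-bit displacement,
 * so adding 1 << 16 afterwards restores the zero-extended value that
 * these MIPS instructions require. The function below simulates the
 * two instructions in plain C; its name is hypothetical.
 */

```c
#include <stdint.h>

/* Simulates "lda t1,uimm" plus the conditional "ldah t1,1(t1)". */
uint64_t sketch_materialize_uimm16(unsigned int uimm)
{
    /* lda: t1 = sign-extended 16-bit immediate. */
    int64_t t1 = (int64_t)(uimm & 0x7fff) - (int64_t)(uimm & 0x8000);

    /* ldah t1,1(t1): add 1 << 16, only when the immediate was negative. */
    if (uimm & 0x8000)
        t1 += 0x10000;

    return (uint64_t)t1;
}
```

/*
 * The result equals the original 16-bit value for every input, whether
 * or not bit 15 is set.
 */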
575    
576    
577     /*
578     * bintrans_write_instruction__addu_etc():
579     */
580     static int bintrans_write_instruction__addu_etc(unsigned char **addrp,
581     int rd, int rs, int rt, int sa, int instruction_type)
582     {
583     unsigned char *a, *unmodified = NULL;
584     int load64 = 0, store = 1, ofs, alpha_rd = ALPHA_T0;
585    
586     alpha_rd = map_MIPS_to_Alpha[rd];
587     if (alpha_rd < 0)
588     alpha_rd = ALPHA_T0;
589    
590     switch (instruction_type) {
591     case SPECIAL_DIV:
592     case SPECIAL_DIVU:
593     return 0;
594     }
595    
596     switch (instruction_type) {
597     case SPECIAL_DADDU:
598     case SPECIAL_DSUBU:
599     case SPECIAL_OR:
600     case SPECIAL_AND:
601     case SPECIAL_NOR:
602     case SPECIAL_XOR:
603     case SPECIAL_DSLL:
604     case SPECIAL_DSRL:
605     case SPECIAL_DSRA:
606     case SPECIAL_DSLL32:
607     case SPECIAL_DSRL32:
608     case SPECIAL_DSRA32:
609     case SPECIAL_SLT:
610     case SPECIAL_SLTU:
611     case SPECIAL_MOVZ:
612     case SPECIAL_MOVN:
613     load64 = 1;
614     }
615    
616     switch (instruction_type) {
617     case SPECIAL_MULT:
618     case SPECIAL_MULTU:
619     if (rd != 0)
620     return 0;
621     store = 0;
622     break;
623     default:
624     if (rd == 0)
625     goto rd0;
626     }
627    
628     a = *addrp;
629    
630     if ((instruction_type == SPECIAL_ADDU || instruction_type == SPECIAL_DADDU
631     || instruction_type == SPECIAL_OR) && rt == 0) {
632     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
633     if (!load64) {
634     *a++ = 0x01; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,t0 */
635     }
636     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rd);
637     *addrp = a;
638     goto rd0;
639     }
640    
641     /* t0 = rs, t1 = rt */
642     if (load64) {
643     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
644     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T1);
645     } else {
646     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
647     *a++ = 0x01; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,t0 */
648     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T1);
649     *a++ = 0x02; *a++ = 0x10; *a++ = 0x40; *a++ = 0x40; /* addl t1,0,t1 */
650     }
651    
652     switch (instruction_type) {
653     case SPECIAL_ADDU:
654     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x22; *a++ = 0x40; /* addl t0,t1,rd */
655     break;
656     case SPECIAL_DADDU:
657     *a++ = alpha_rd; *a++ = 0x04; *a++ = 0x22; *a++ = 0x40; /* addq t0,t1,rd */
658     break;
659     case SPECIAL_SUBU:
660     *a++ = 0x20 + alpha_rd; *a++ = 0x01; *a++ = 0x22; *a++ = 0x40; /* subl t0,t1,rd */
661     break;
662     case SPECIAL_DSUBU:
663     *a++ = 0x20 + alpha_rd; *a++ = 0x05; *a++ = 0x22; *a++ = 0x40; /* subq t0,t1,rd */
664     break;
665     case SPECIAL_AND:
666     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x22; *a++ = 0x44; /* and t0,t1,rd */
667     break;
668     case SPECIAL_OR:
669     *a++ = alpha_rd; *a++ = 0x04; *a++ = 0x22; *a++ = 0x44; /* or t0,t1,rd */
670     break;
671     case SPECIAL_NOR:
672     *a++ = 0x01; *a++ = 0x04; *a++ = 0x22; *a++ = 0x44; /* or t0,t1,t0 */
673     *a++ = alpha_rd; *a++ = 0x05; *a++ = 0xe1; *a++ = 0x47; /* not t0,rd */
674     break;
675     case SPECIAL_XOR:
676     *a++ = alpha_rd; *a++ = 0x08; *a++ = 0x22; *a++ = 0x44; /* xor t0,t1,rd */
677     break;
678     case SPECIAL_SLL:
679     *a++ = 0x21; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sll t1,sa,t0 */
680     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,rd */
681     break;
682     case SPECIAL_SLLV:
683     /* rd = rt << (rs&31) (logical) t0 = t1 << (t0&31) */
684     *a++ = 0x01; *a++ = 0xf0; *a++ = 0x23; *a++ = 0x44; /* and t0,31,t0 */
685     *a++ = 0x21; *a++ = 0x07; *a++ = 0x41; *a++ = 0x48; /* sll t1,t0,t0 */
686     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,rd */
687     break;
688     case SPECIAL_DSLL:
689     *a++ = 0x20 + alpha_rd; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sll t1,sa,rd */
690     break;
691     case SPECIAL_DSLL32:
692     sa += 32;
693     *a++ = 0x20 + alpha_rd; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sll t1,sa,rd */
694     break;
695     case SPECIAL_SRA:
696     *a++ = 0x81; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sra t1,sa,t0 */
697     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,rd */
698     break;
699     case SPECIAL_SRAV:
700     /* rd = rt >> (rs&31) (arithmetic) t0 = t1 >> (t0&31) */
701     *a++ = 0x01; *a++ = 0xf0; *a++ = 0x23; *a++ = 0x44; /* and t0,31,t0 */
702     *a++ = 0x81; *a++ = 0x07; *a++ = 0x41; *a++ = 0x48; /* sra t1,t0,t0 */
703     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,rd */
704     break;
705     case SPECIAL_DSRA:
706     *a++ = 0x80 + alpha_rd; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sra t1,sa,rd */
707     break;
708     case SPECIAL_DSRA32:
709     sa += 32;
710     *a++ = 0x80 + alpha_rd; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sra t1,sa,rd */
711     break;
712     case SPECIAL_SRL:
713     *a++ = 0x22; *a++ = 0xf6; *a++ = 0x41; *a++ = 0x48; /* zapnot t1,0xf,t1 (use only lowest 32 bits) */
714     /* Note: bits of sa are distributed among two different bytes. */
715     *a++ = 0x81; *a++ = 0x16 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* srl t1,sa,t0 */
716     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,rd */
717     break;
718     case SPECIAL_SRLV:
719     /* rd = rt >> (rs&31) (logical) t0 = t1 >> (t0&31) */
720     *a++ = 0x22; *a++ = 0xf6; *a++ = 0x41; *a++ = 0x48; /* zapnot t1,0xf,t1 (use only lowest 32 bits) */
721     *a++ = 0x01; *a++ = 0xf0; *a++ = 0x23; *a++ = 0x44; /* and t0,31,t0 */
722     *a++ = 0x81; *a++ = 0x06; *a++ = 0x41; *a++ = 0x48; /* srl t1,t0,t0 */
723     *a++ = alpha_rd; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,rd */
724     break;
725     case SPECIAL_DSRL:
726     /* Note: bits of sa are distributed among two different bytes. */
727     *a++ = 0x80 + alpha_rd; *a++ = 0x16 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* srl t1,sa,rd */
728     break;
729     case SPECIAL_DSRL32:
730     /* Note: bits of sa are distributed among two different bytes. */
731     sa += 32;
732     *a++ = 0x80 + alpha_rd; *a++ = 0x16 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* srl t1,sa,rd */
733     break;
734     case SPECIAL_SLT:
735     *a++ = 0xa0 + alpha_rd; *a++ = 0x09; *a++ = 0x22; *a++ = 0x40; /* cmplt t0,t1,t0 */
736     break;
737     case SPECIAL_SLTU:
738     *a++ = 0xa0 + alpha_rd; *a++ = 0x03; *a++ = 0x22; *a++ = 0x40; /* cmpult t0,t1,t0 */
739     break;
740     case SPECIAL_MULT:
741     case SPECIAL_MULTU:
742     if (instruction_type == SPECIAL_MULTU) {
743     /* 21 f6 21 48 zapnot t0,0xf,t0 */
744     /* 22 f6 41 48 zapnot t1,0xf,t1 */
745     *a++ = 0x21; *a++ = 0xf6; *a++ = 0x21; *a++ = 0x48;
746     *a++ = 0x22; *a++ = 0xf6; *a++ = 0x41; *a++ = 0x48;
747     }
748    
749     /* 03 04 22 4c mulq t0,t1,t2 */
750     *a++ = 0x03; *a++ = 0x04; *a++ = 0x22; *a++ = 0x4c;
751    
752     /* 01 10 60 40 addl t2,0,t0 */
753     *a++ = 0x01; *a++ = 0x10; *a++ = 0x60; *a++ = 0x40;
754    
755     ofs = ((size_t)&dummy_cpu.cd.mips.lo) - (size_t)&dummy_cpu;
756     *a++ = (ofs & 255); *a++ = (ofs >> 8); *a++ = 0x30; *a++ = 0xb4; /* stq t0,lo(a0) */
757    
758     /* 81 17 64 48 sra t2,0x20,t0 */
759     *a++ = 0x81; *a++ = 0x17; *a++ = 0x64; *a++ = 0x48;
760     *a++ = 0x01; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,0,t0 */
761     ofs = ((size_t)&dummy_cpu.cd.mips.hi) - (size_t)&dummy_cpu;
762     *a++ = (ofs & 255); *a++ = (ofs >> 8); *a++ = 0x30; *a++ = 0xb4; /* stq t0,hi(a0) */
763     break;
764     case SPECIAL_MOVZ:
765     /* if rt == 0 then rd = rs ==> if t1 != 0, skip the store below so that rd is left unmodified */
766     /* 00 00 40 f4 bne t1,unmodified */
767     unmodified = a;
768     *a++ = 0x00; *a++ = 0x00; *a++ = 0x40; *a++ = 0xf4;
769     alpha_rd = ALPHA_T0;
770     break;
771     case SPECIAL_MOVN:
772     /* if rt != 0 then rd = rs ==> if t1 == 0, skip the store below so that rd is left unmodified */
773     /* 00 00 40 e4 beq t1,unmodified */
774     unmodified = a;
775     *a++ = 0x00; *a++ = 0x00; *a++ = 0x40; *a++ = 0xe4;
776     alpha_rd = ALPHA_T0;
777     break;
778     }
779    
780     if (store && alpha_rd == ALPHA_T0) {
781     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rd);
782     }
783    
784     if (unmodified != NULL)
785     *unmodified = ((size_t)a - (size_t)unmodified - 4) / 4;
786    
787     *addrp = a;
788     rd0:
789     bintrans_write_pc_inc(addrp);
790     return 1;
791     }
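
/*
 * Illustrative sketch (not part of the original file): the shift cases
 * above emit each instruction byte by byte, which is why the bits of
 * sa end up spread over two bytes. The 8-bit literal field of the
 * Alpha operate format sits in bits 20..13 of the little-endian word,
 * straddling the second and third bytes. The two hypothetical helpers
 * below build "sll t1,sa,t0" both ways so the results can be compared.
 */

```c
#include <stdint.h>

/* Byte-wise emission, following the SPECIAL_SLL case. */
uint32_t sketch_sll_from_bytes(int sa)
{
    unsigned char b[4];

    b[0] = 0x21;
    b[1] = (unsigned char)(0x17 + ((sa & 7) << 5));
    b[2] = (unsigned char)(0x40 + (sa >> 3));
    b[3] = 0x48;

    /* Assemble the bytes in little-endian order. */
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Word-wise equivalent: the literal goes into bits 20..13. */
uint32_t sketch_sll_from_word(int sa)
{
    return 0x48401721u | ((uint32_t)sa << 13);
}
```

/*
 * Both helpers are expected to agree for every sa in 0..63, which
 * covers the "+32" variants such as DSLL32 as well.
 */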
792    
793    
794     /*
795     * bintrans_write_instruction__branch():
796     */
797     static int bintrans_write_instruction__branch(unsigned char **addrp,
798     int instruction_type, int regimm_type, int rt, int rs, int imm)
799     {
800     uint32_t *a, *b, *c = NULL;
801     int alpha_rs, alpha_rt, likely = 0, ofs;
802    
803     alpha_rs = map_MIPS_to_Alpha[rs];
804     alpha_rt = map_MIPS_to_Alpha[rt];
805    
806     switch (instruction_type) {
807     case HI6_BEQL:
808     case HI6_BNEL:
809     case HI6_BLEZL:
810     case HI6_BGTZL:
811     likely = 1;
812     }
813    
814     /*
815     * t0 = gpr[rt]; t1 = gpr[rs];
816     *
817     * 50 00 30 a4 ldq t0,80(a0)
818     * 58 00 50 a4 ldq t1,88(a0)
819     */
820    
821     switch (instruction_type) {
822     case HI6_BEQ:
823     case HI6_BNE:
824     case HI6_BEQL:
825     case HI6_BNEL:
826     if (alpha_rt < 0) {
827     bintrans_move_MIPS_reg_into_Alpha_reg(addrp, rt, ALPHA_T0);
828     alpha_rt = ALPHA_T0;
829     }
830     }
831    
832     if (alpha_rs < 0) {
833     bintrans_move_MIPS_reg_into_Alpha_reg(addrp, rs, ALPHA_T1);
834     alpha_rs = ALPHA_T1;
835     }
836    
837     a = (uint32_t *) *addrp;
838    
839     /*
840     * Compare alpha_rt (t0) and alpha_rs (t1) for equality (BEQ).
841     * If the result was false (equal to zero), then skip a lot
842     * of instructions:
843     *
844     * a1 05 22 40 cmpeq t0,t1,t0
845     * 01 00 20 e4 beq t0,14 <f+0x14>
846     */
847     b = NULL;
848     if ((instruction_type == HI6_BEQ ||
849     instruction_type == HI6_BEQL) && rt != rs) {
850     /* cmpeq rt,rs,t0 */
851     *a++ = 0x400005a1 | (alpha_rt << 21) | (alpha_rs << 16);
852     b = a;
853     *a++ = 0xe4200001; /* beq */
854     }
855     if (instruction_type == HI6_BNE || instruction_type == HI6_BNEL) {
856     /* cmpeq rt,rs,t0 */
857     *a++ = 0x400005a1 | (alpha_rt << 21) | (alpha_rs << 16);
858     b = a;
859     *a++ = 0xf4200001; /* bne */
860     }
861     if (instruction_type == HI6_BLEZ || instruction_type == HI6_BLEZL) {
862     /* cmple rs,0,t0 */
863     *a++ = 0x40001da1 | (alpha_rs << 21);
864     b = a;
865     *a++ = 0xe4200001; /* beq */
866     }
867     if (instruction_type == HI6_BGTZ || instruction_type == HI6_BGTZL) {
868     /* cmple rs,0,t0 */
869     *a++ = 0x40001da1 | (alpha_rs << 21);
870     b = a;
871     *a++ = 0xf4200001; /* bne */
872     }
873     if (instruction_type == HI6_REGIMM && regimm_type == REGIMM_BLTZ) {
874     /* cmplt rs,0,t0 */
875     *a++ = 0x400019a1 | (alpha_rs << 21);
876     b = a;
877     *a++ = 0xe4200001; /* beq */
878     }
879     if (instruction_type == HI6_REGIMM && regimm_type == REGIMM_BGEZ) {
880     *a++ = 0x207fffff; /* lda t2,-1 */
881     /* cmple rs,t2,t0 */
882     *a++ = 0x40030da1 | (alpha_rs << 21);
883     b = a;
884     *a++ = 0xf4200001; /* bne */
885     }
886    
887     /*
888     * Perform the jump by setting cpu->delay_slot = TO_BE_DELAYED
889     * and cpu->delay_jmpaddr = pc + 4 + (imm << 2).
890     *
891     * 04 00 26 20 lda t0,4(t5) add 4
892     * c8 01 5f 20 lda t1,456
893     * 4a 04 41 40 s4addq t1,t0,s1 s1 = (t1<<2) + t0
894     */
895    
896     *a++ = 0x20260004; /* lda t0,4(t5) */
897     *a++ = 0x205f0000 | (imm & 0xffff); /* lda t1,imm */
898     *a++ = 0x4041044a; /* s4addq t1,t0,s1 */
899    
900     /* 02 00 3f 21 lda s0,TO_BE_DELAYED */
901     *a++ = 0x213f0000 | TO_BE_DELAYED;
902    
903     /*
904     * Special case: "likely"-branches:
905     */
906     if (likely) {
907     c = a;
908     *a++ = 0xc3e00001; /* br delayed_ok */
909    
910     if (b != NULL)
911     *((unsigned char *)b) = ((size_t)a - (size_t)b - 4) / 4;
912    
913     /* cpu->cd.mips.nullify_next = 1; */
914     /* 01 00 3f 20 lda t0,1 */
915     *a++ = 0x203f0001;
916     ofs = (size_t)&dummy_cpu.cd.mips.nullify_next - (size_t)&dummy_cpu;
917     *a++ = 0xb0300000 | (ofs & 0xffff); /* stl t0,nullify_next(a0) */
918    
919     /* fail, so that the next instruction is handled manually: */
920     *addrp = (unsigned char *) a;
921     bintrans_write_pc_inc(addrp);
922     bintrans_write_chunkreturn_fail(addrp);
923     a = (uint32_t *) *addrp;
924    
925     if (c != NULL)
926     *((unsigned char *)c) = ((size_t)a - (size_t)c - 4) / 4;
927     } else {
928     /* Normal (non-likely) exit: */
929     if (b != NULL)
930     *((unsigned char *)b) = ((size_t)a - (size_t)b - 4) / 4;
931     }
932    
933     *addrp = (unsigned char *) a;
934     bintrans_write_pc_inc(addrp);
935     return 1;
936     }
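
/*
 * Illustrative sketch (not part of the original file): the branch
 * target computed by the "lda t0,4(t5); lda t1,imm; s4addq t1,t0,s1"
 * sequence above, written in C. lda sign-extends the 16-bit immediate
 * and s4addq multiplies it by 4 before adding. The function name is
 * hypothetical.
 */

```c
#include <stdint.h>

/* delay_jmpaddr = pc + 4 + (sign-extended imm << 2) */
uint64_t sketch_branch_target(uint64_t pc, unsigned int imm)
{
    /* lda t1,imm: sign-extend the low 16 bits. */
    int64_t simm = (int64_t)(imm & 0x7fff) - (int64_t)(imm & 0x8000);

    /* s4addq t1,t0,s1: s1 = t1 * 4 + t0, where t0 = pc + 4. */
    return (uint64_t)(simm * 4) + (pc + 4);
}
```

/*
 * An immediate of 0xffff (that is, -1) makes the branch target the
 * branch instruction itself, as expected for MIPS.
 */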
937    
938    
939     /*
940     * bintrans_write_instruction__jr():
941     */
942     static int bintrans_write_instruction__jr(unsigned char **addrp, int rs, int rd, int special)
943     {
944     uint32_t *a;
945     int alpha_rd;
946    
947     alpha_rd = map_MIPS_to_Alpha[rd];
948     if (alpha_rd < 0)
949     alpha_rd = ALPHA_T0;
950    
951     /*
952     * Perform the jump by setting cpu->delay_slot = TO_BE_DELAYED
953     * and cpu->delay_jmpaddr = gpr[rs].
954     */
955    
956     bintrans_move_MIPS_reg_into_Alpha_reg(addrp, rs, ALPHA_S1);
957    
958     a = (uint32_t *) *addrp;
959     /* 02 00 3f 21 lda s0,TO_BE_DELAYED */
960     *a++ = 0x213f0000 | TO_BE_DELAYED;
961     *addrp = (unsigned char *) a;
962    
963     if (special == SPECIAL_JALR && rd != 0) {
964     /* gpr[rd] = retaddr (pc + 8) */
965     a = (uint32_t *) *addrp;
966     /* lda alpha_rd,8(t5) */
967     *a++ = 0x20060008 | (alpha_rd << 21);
968     *addrp = (unsigned char *) a;
969     if (alpha_rd == ALPHA_T0)
970     bintrans_move_Alpha_reg_into_MIPS_reg(addrp, ALPHA_T0, rd);
971     }
972    
973     bintrans_write_pc_inc(addrp);
974     return 1;
975     }
976    
977    
978     /*
979     * bintrans_write_instruction__jal():
980     */
981     static int bintrans_write_instruction__jal(unsigned char **addrp,
982     int imm, int link)
983     {
984     uint32_t *a;
985    
986     a = (uint32_t *) *addrp;
987    
988     /* gpr[31] = retaddr (NOTE: mips register 31 is in alpha reg s3) */
989     if (link) {
990     *a++ = 0x21860008; /* lda s3,8(t5) */
991     }
992    
993     /* Set the jmpaddr to top 4 bits of pc + lowest 28 bits of imm*4: */
994    
995     /*
996     * imm = 4*imm;
997     * t0 = ((pc + 4) & ~0x0fffffff) | imm;
998     *
999     * 04 00 26 20 lda t0,4(t5) <-- because the jump is from the delay slot
1000     * 23 01 5f 24 ldah t1,291
1001     * 67 45 42 20 lda t1,17767(t1)
1002     * 00 f0 7f 24 ldah t2,-4096
1003     * 04 00 23 44 and t0,t2,t3
1004     * 0a 04 44 44 or t1,t3,s1
1005     */
1006     imm *= 4;
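1007     *a++ = 0x20260004; /* lda t0,4(t5) */
1008     *a++ = 0x245f0000 | ((imm >> 16) + (imm & 0x8000? 1 : 0)); /* ldah t1,(high 16 bits of imm) */
1009     *a++ = 0x20420000 | (imm & 0xffff); /* lda t1,(low 16 bits of imm)(t1) */
1010     *a++ = 0x247ff000; /* ldah t2,-4096 */
1011     *a++ = 0x44230004; /* and t0,t2,t3 */
1012     *a++ = 0x4444040a; /* or t1,t3,s1 */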
1013    
1014     /* 02 00 3f 21 lda s0,TO_BE_DELAYED */
1015     *a++ = 0x213f0000 | TO_BE_DELAYED;
1016    
1017     /* If the machine continues executing here, it will return
1018     to the main loop, which is fine. */
1019    
1020     *addrp = (unsigned char *) a;
1021     bintrans_write_pc_inc(addrp);
1022     return 1;
1023     }
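
/*
 * Illustrative sketch (not part of the original file): the two pieces
 * of arithmetic behind the jal sequence above. The first helper shows
 * why the high half passed to ldah is incremented when bit 15 of the
 * constant is set (the following lda sign-extends its low half). The
 * second computes the jump target itself. Both names are hypothetical,
 * and the first helper assumes a constant below 2^28, as is the case
 * for imm * 4 here.
 */

```c
#include <stdint.h>

/* Value produced by "ldah t1,hi" followed by "lda t1,lo(t1)". */
uint64_t sketch_ldah_lda(uint32_t imm)
{
    int64_t hi = (int64_t)(imm >> 16) + ((imm & 0x8000) ? 1 : 0);
    int64_t lo = (int64_t)(imm & 0x7fff) - (int64_t)(imm & 0x8000);

    return (uint64_t)(hi * 65536 + lo);
}

/* t0 = ((pc + 4) & ~0x0fffffff) | (imm26 * 4) */
uint64_t sketch_jal_target(uint64_t pc, uint32_t imm26)
{
    uint64_t imm = (uint64_t)(imm26 & 0x03ffffff) * 4;

    return ((pc + 4) & ~(uint64_t)0x0fffffff) | imm;
}
```

/*
 * The split into high and low halves round-trips the original
 * constant, and the target keeps the upper bits of the address of the
 * delay slot.
 */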
1024    
1025    
1026     /*
1027     * bintrans_write_instruction__delayedbranch():
1028     */
1029     static int bintrans_write_instruction__delayedbranch(
1030     struct memory *mem, unsigned char **addrp,
1031     uint32_t *potential_chunk_p, uint32_t *chunks,
1032     int only_care_about_chunk_p, int p, int forward)
1033     {
1034     unsigned char *a, *skip=NULL, *generic64bit;
1035     int ofs;
1036     uint64_t alpha_addr, subaddr;
1037    
1038     a = *addrp;
1039    
1040     if (!only_care_about_chunk_p) {
1041     /* Skip all of this if there is no branch: */
1042     skip = a;
1043     *a++ = 0; *a++ = 0; *a++ = 0x20; *a++ = 0xe5; /* beq s0,skip */
1044    
1045     /*
1046     * Perform the jump by setting cpu->delay_slot = 0
1047     * and pc = cpu->delay_jmpaddr.
1048     */
1049     /* 00 00 3f 21 lda s0,0 */
1050     *a++ = 0; *a++ = 0; *a++ = 0x3f; *a++ = 0x21;
1051    
1052     bintrans_move_MIPS_reg_into_Alpha_reg(&a, MIPSREG_DELAY_JMPADDR, ALPHA_T0);
1053     bintrans_move_MIPS_reg_into_Alpha_reg(&a, MIPSREG_PC, ALPHA_T3);
1054     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, MIPSREG_PC);
1055     }
1056    
1057     if (potential_chunk_p == NULL) {
1058     if (mem->bintrans_32bit_only) {
1059     /* 34 12 70 a7 ldq t12,4660(a0) */
1060     ofs = (size_t)&dummy_cpu.cd.mips.bintrans_jump_to_32bit_pc - (size_t)&dummy_cpu;
1061     *a++ = ofs; *a++ = ofs >> 8; *a++ = 0x70; *a++ = 0xa7;
1062    
1063     /* 00 00 fb 6b jmp (t12) */
1064     *a++ = 0; *a++ = 0; *a++ = 0xfb; *a++ = 0x6b;
1065     } else {
1066     /*
1067     * If the highest 32 bits of the address are either
1068     * 0x00000000 or 0xffffffff, then the tables used for
1069     * 32-bit load/stores can be used.
1070     *
1071     * 81 16 24 4a srl a1,0x20,t0
1072     * 03 00 20 e4 beq t0,14 <ok1>
1073     * 01 30 20 40 addl t0,0x1,t0
1074     * 01 00 20 e4 beq t0,14 <ok1>
1075     * 01 00 e0 c3 br 18 <nook>
1076     */
1077     *a++ = 0x81; *a++ = 0x16; *a++ = 0x24; *a++ = 0x4a;
1078     *a++ = 0x03; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4;
1079     *a++ = 0x01; *a++ = 0x30; *a++ = 0x20; *a++ = 0x40;
1080     *a++ = 0x01; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4;
1081     generic64bit = a;
1082     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1083    
1084     /* 34 12 70 a7 ldq t12,4660(a0) */
1085     ofs = (size_t)&dummy_cpu.cd.mips.bintrans_jump_to_32bit_pc - (size_t)&dummy_cpu;
1086     *a++ = ofs; *a++ = ofs >> 8; *a++ = 0x70; *a++ = 0xa7;
1087    
1088     /* 00 00 fb 6b jmp (t12) */
1089     *a++ = 0; *a++ = 0; *a++ = 0xfb; *a++ = 0x6b;
1090    
1091    
1092     if (generic64bit != NULL)
1093     *generic64bit = ((size_t)a - (size_t)generic64bit - 4) / 4;
1094    
1095     /* Not much we can do here if this wasn't to the same
1096     physical page... */
1097    
1098     *a++ = 0xfc; *a++ = 0xff; *a++ = 0x84; *a++ = 0x20; /* lda t3,-4(t3) */
1099    
1100     /*
1101     * Compare the old pc (t3) and the new pc (t0). If they are on the
1102     * same virtual page (which means that they are on the same physical
1103     * page), then we can check the right chunk pointer, and if it
1104     * is non-NULL, then we can jump there. Otherwise just return.
1105     *
1106     * 00 f0 5f 20 lda t1,-4096
1107     * 01 00 22 44 and t0,t1,t0
1108     * 04 00 82 44 and t3,t1,t3
1109     * a3 05 24 40 cmpeq t0,t3,t2
1110     * 01 00 60 f4 bne t2,7c <ok2>
1111     * 01 80 fa 6b ret
1112     */
1113     *a++ = 0x00; *a++ = 0xf0; *a++ = 0x5f; *a++ = 0x20; /* lda */
1114     *a++ = 0x01; *a++ = 0x00; *a++ = 0x22; *a++ = 0x44; /* and */
1115     *a++ = 0x04; *a++ = 0x00; *a++ = 0x82; *a++ = 0x44; /* and */
1116     *a++ = 0xa3; *a++ = 0x05; *a++ = 0x24; *a++ = 0x40; /* cmpeq */
1117     *a++ = 0x01; *a++ = 0x00; *a++ = 0x60; *a++ = 0xf4; /* bne */
1118     *a++ = 0x01; *a++ = 0x80; *a++ = 0xfa; *a++ = 0x6b; /* ret */
1119    
1120     /* Don't execute too many instructions. (see comment below) */
1121     *a++ = (N_SAFE_BINTRANS_LIMIT-1)&255; *a++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8)&255;
1122     *a++ = 0x5f; *a++ = 0x20; /* lda t1,0x1fff */
1123     *a++ = 0xa1; *a++ = 0x0d; *a++ = 0xe2; *a++ = 0x40; /* cmple t6,t1,t0 */
1124     *a++ = 0x01; *a++ = 0x00; *a++ = 0x20; *a++ = 0xf4; /* bne */
1125     *a++ = 0x01; *a++ = 0x80; *a++ = 0xfa; *a++ = 0x6b; /* ret */
1126    
1127     /* 15 bits at a time, which means max 60 bits, but
1128     that should be enough. The top 4 bits are probably
1129     not used by userland Alpha code. (TODO: verify this) */
1130     alpha_addr = (size_t)chunks;
1131     subaddr = (alpha_addr >> 45) & 0x7fff;
1132    
1133     /*
1134     * 00 00 3f 20 lda t0,0
1135     * 21 f7 21 48 sll t0,0xf,t0
1136     * 34 12 21 20 lda t0,4660(t0)
1137     * 21 f7 21 48 sll t0,0xf,t0
1138     * 34 12 21 20 lda t0,4660(t0)
1139     * 21 f7 21 48 sll t0,0xf,t0
1140     * 34 12 21 20 lda t0,4660(t0)
1141     */
1142    
1143     /* Start with the topmost 15 bits: */
1144     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x3f; *a++ = 0x20;
1145     *a++ = 0x21; *a++ = 0xf7; *a++ = 0x21; *a++ = 0x48; /* sll */
1146    
1147     subaddr = (alpha_addr >> 30) & 0x7fff;
1148     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x21; *a++ = 0x20;
1149     *a++ = 0x21; *a++ = 0xf7; *a++ = 0x21; *a++ = 0x48; /* sll */
1150    
1151     subaddr = (alpha_addr >> 15) & 0x7fff;
1152     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x21; *a++ = 0x20;
1153     *a++ = 0x21; *a++ = 0xf7; *a++ = 0x21; *a++ = 0x48; /* sll */
1154    
1155     subaddr = alpha_addr & 0x7fff;
1156     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x21; *a++ = 0x20;
1157    
1158     /*
1159     * t2 = pc
1160     * t1 = t2 & 0xfff
1161     * t0 += t1
1162     *
1163     * ff 0f 5f 20 lda t1,4095
1164     * 02 00 62 44 and t2,t1,t1
1165     * 01 04 22 40 addq t0,t1,t0
1166     */
1167     bintrans_move_MIPS_reg_into_Alpha_reg(&a, MIPSREG_PC, ALPHA_T2);
1168     *a++ = 0xff; *a++ = 0x0f; *a++ = 0x5f; *a++ = 0x20; /* lda */
1169     *a++ = 0x02; *a++ = 0x00; *a++ = 0x62; *a++ = 0x44; /* and */
1170     *a++ = 0x01; *a++ = 0x04; *a++ = 0x22; *a++ = 0x40; /* addq */
1171    
1172     /*
1173     * Load the chunk pointer (actually, a 32-bit offset) into t0.
1174     * If it is zero, then skip the following.
1175     * Add cpu->chunk_base_address to t0.
1176     * Jump to t0.
1177     */
1178    
1179     *a++ = 0x00; *a++ = 0x00; *a++ = 0x21; *a++ = 0xa0; /* ldl t0,0(t0) */
1180     *a++ = 0x03; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4; /* beq t0,<skip> */
1181    
1182     /* ldq t2,chunk_base_address(a0) */
1183     ofs = ((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu;
1184     *a++ = (ofs & 255); *a++ = (ofs >> 8); *a++ = 0x70; *a++ = 0xa4;
1185     /* addq t0,t2,t0 */
1186     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x40;
1187    
1188     /* 00 00 e1 6b jmp (t0) */
1189     *a++ = 0x00; *a++ = 0x00; *a++ = 0xe1; *a++ = 0x6b; /* jmp (t0) */
1190    
1191     /* Failure, then return to the main loop. */
1192     *a++ = 0x01; *a++ = 0x80; *a++ = 0xfa; *a++ = 0x6b; /* ret */
1193     }
1194     } else {
1195     /*
1196     * To make sure that we don't stay away from the main
1197     * program loop for too long, we need to return every once
1198     * in a while (interrupts etc).
1199     *
1200     * Load the "nr of instructions executed" (which is an int)
1201     * and see if it is below a certain threshold. If so, then
1202     * we go on with the fast path (bintrans), otherwise we
1203     * abort by returning.
1204     *
1205     * f4 01 5f 20 lda t1,500 (some low number...)
1206     * a1 0d c2 40 cmple t6,t1,t0
1207     * 01 00 20 f4 bne t0,14 <f+0x14>
1208     */
1209     if (!only_care_about_chunk_p && !forward) {
1210     *a++ = (N_SAFE_BINTRANS_LIMIT-1)&255; *a++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8)&255;
1211     *a++ = 0x5f; *a++ = 0x20; /* lda t1,0x1fff */
1212     *a++ = 0xa1; *a++ = 0x0d; *a++ = 0xe2; *a++ = 0x40; /* cmple t6,t1,t0 */
1213     *a++ = 0x01; *a++ = 0x00; *a++ = 0x20; *a++ = 0xf4; /* bne */
1214     *a++ = 0x01; *a++ = 0x80; *a++ = 0xfa; *a++ = 0x6b; /* ret */
1215     }
1216    
1217     /*
1218     * potential_chunk_p points to a uint32_t.
1219     * If this value is non-zero, then it is the offset of a piece
1220     * of Alpha machine code corresponding to the address
1221     * we're jumping to. Otherwise, those instructions haven't
1222     * been translated yet, so we have to return to the main
1223     * loop. (Actually, cpu->chunk_base_address has to be added,
1224     * because the uint32_t is limited to 32-bit offsets.)
1225     *
1226     * Case 1: The value is already non-zero at translation
1227     * time. Then we can make a direct (fast) native
1228     * Alpha jump to the code chunk.
1229     *
1230     * Case 2: The value was zero at translation time, so we
1231     * have to check during runtime.
1232     */
1233    
1234     /* Case 1: */
1235     /* printf("%08x ", *potential_chunk_p); */
1236     alpha_addr = *potential_chunk_p + (size_t)mem->translation_code_chunk_space;
1237     ofs = (alpha_addr - ((size_t)a+4)) / 4;
1238     /* printf("%016llx %016llx %i\n", (long long)alpha_addr, (long long)a, ofs); */
1239    
1240     if ((*potential_chunk_p) != 0 && ofs > -0xfffff && ofs < 0xfffff) {
1241     *a++ = ofs & 255; *a++ = (ofs >> 8) & 255; *a++ = 0xe0 + ((ofs >> 16) & 0x1f); *a++ = 0xc3; /* br <chunk> */
1242     } else {
1243     /* Case 2: */
1244    
1245     bintrans_register_potential_quick_jump(mem, a, p);
1246    
1247     /* 15 bits at a time, which means max 60 bits, but
1248     that should be enough. The top 4 bits are probably
1249     not used by userland Alpha code. (TODO: verify this) */
1250     alpha_addr = (size_t)potential_chunk_p;
1251     subaddr = (alpha_addr >> 45) & 0x7fff;
1252    
1253     /*
1254     * 00 00 3f 20 lda t0,0
1255     * 21 f7 21 48 sll t0,0xf,t0
1256     * 34 12 21 20 lda t0,4660(t0)
1257     * 21 f7 21 48 sll t0,0xf,t0
1258     * 34 12 21 20 lda t0,4660(t0)
1259     * 21 f7 21 48 sll t0,0xf,t0
1260     * 34 12 21 20 lda t0,4660(t0)
1261     */
1262    
1263     /* Start with the topmost 15 bits: */
1264     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x3f; *a++ = 0x20;
1265     *a++ = 0x21; *a++ = 0xf7; *a++ = 0x21; *a++ = 0x48; /* sll */
1266    
1267     subaddr = (alpha_addr >> 30) & 0x7fff;
1268     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x21; *a++ = 0x20;
1269     *a++ = 0x21; *a++ = 0xf7; *a++ = 0x21; *a++ = 0x48; /* sll */
1270    
1271     subaddr = (alpha_addr >> 15) & 0x7fff;
1272     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x21; *a++ = 0x20;
1273     *a++ = 0x21; *a++ = 0xf7; *a++ = 0x21; *a++ = 0x48; /* sll */
1274    
1275     subaddr = alpha_addr & 0x7fff;
1276     *a++ = (subaddr & 255); *a++ = (subaddr >> 8); *a++ = 0x21; *a++ = 0x20;
1277    
1278     /*
1279     * Load the chunk pointer (a 32-bit offset) into t0. If it is
1280     * zero, then skip the following jump. Otherwise add
1281     * cpu->chunk_base_address to t0 and jump to t0.
1282     */
1283     *a++ = 0x00; *a++ = 0x00; *a++ = 0x21; *a++ = 0xa0; /* ldl t0,0(t0) */
1284     *a++ = 0x03; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4; /* beq t0,<skip> */
1285    
1286     /* ldq t2,chunk_base_address(a0) */
1287     ofs = ((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu;
1288     *a++ = (ofs & 255); *a++ = (ofs >> 8); *a++ = 0x70; *a++ = 0xa4;
1289     /* addq t0,t2,t0 */
1290     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x40;
1291    
1292     /* 00 00 e1 6b jmp (t0) */
1293     *a++ = 0x00; *a++ = 0x00; *a++ = 0xe1; *a++ = 0x6b; /* jmp (t0) */
1294    
1295     /* "Failure", then let's return to the main loop. */
1296     *a++ = 0x01; *a++ = 0x80; *a++ = 0xfa; *a++ = 0x6b; /* ret */
1297     }
1298     }
1299    
1300     if (skip != NULL) {
1301     ofs = ((size_t)a - (size_t)skip - 4) / 4; /* displacement, in instructions */
1302     skip[0] = ofs & 255;
1303     skip[1] = (ofs >> 8) & 255;
1304     }
1305    
1306     *addrp = a;
1307     return 1;
1308     }
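/*
 * Illustration only, not part of the original file: the function above
 * loads a 64-bit host address into t0 with four lda instructions
 * separated by "sll t0,0xf,t0", 15 bits per step. lda sign-extends its
 * 16-bit displacement, so keeping each piece below 0x8000 presumably
 * avoids negative displacements. The sketch below replays that sequence
 * on the host and shows that only the low 60 bits survive. The helper
 * names are hypothetical.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration; it does not exist in GXemul. */

/* Rebuild an address the way the emitted lda/sll sequence would. */
static uint64_t rebuild_from_15bit_pieces(uint64_t addr)
{
	static const int shift[4] = { 45, 30, 15, 0 };
	uint64_t t0 = 0;
	int i;

	for (i = 0; i < 4; i++) {
		uint64_t subaddr = (addr >> shift[i]) & 0x7fff;

		if (i > 0)
			t0 <<= 15;	/* sll t0,0xf,t0 */
		t0 += subaddr;		/* lda t0,subaddr(t0) */
	}
	return t0;
}

/* Addresses below 2^60 round-trip; the top 4 bits are dropped. */
static void pieces_selfcheck(void)
{
	assert(rebuild_from_15bit_pieces(0x000001200abcdef0ULL)
	    == 0x000001200abcdef0ULL);
	assert(rebuild_from_15bit_pieces(0xf000000000000010ULL) == 0x10);
}
```

/* Calling pieces_selfcheck() from a test program should pass silently. */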
1309    
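/*
 * Illustration only, not part of the original file: forward branches
 * such as "beq s0,skip" are emitted with a zero displacement and
 * patched once the target is known. An Alpha branch displacement counts
 * 4-byte instructions from the address following the branch, hence
 * (target - branch - 4) / 4. The sketch below patches the low 16 bits
 * of the field, computing the displacement once and then splitting it
 * into bytes. The helper names and the buffer are hypothetical.
 */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration; they do not exist in GXemul. */

/* Patch the low 16 displacement bits of the branch stored at 'branch'. */
static void patch_branch16(unsigned char *branch, const unsigned char *target)
{
	size_t disp = ((size_t)target - (size_t)branch - 4) / 4;

	branch[0] = disp & 255;
	branch[1] = (disp >> 8) & 255;
}

/* Read the low 16 displacement bits back from the instruction bytes. */
static unsigned int branch_disp16(const unsigned char *branch)
{
	return branch[0] | ((unsigned int)branch[1] << 8);
}

/*
 * Emit "beq s0,<target>" at the start of a scratch buffer, where the
 * target lies n instructions past the instruction after the branch,
 * and return the displacement that was patched in.
 */
static unsigned int demo_forward_branch(unsigned int n)
{
	static unsigned char code[4 * 1024];

	assert(n <= 1023);	/* keep the target inside the buffer */
	code[0] = 0; code[1] = 0; code[2] = 0x20; code[3] = 0xe5;
	patch_branch16(code, code + 4 + 4 * (size_t)n);
	return branch_disp16(code);
}
```

/* For example, demo_forward_branch(256) yields the bytes 00 01 20 e5. */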
1310    
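/*
 * Illustration only, not part of the original file: before using the
 * 32-bit tables, the delayed-branch code above tests whether the upper
 * 32 bits of the address are 0x00000000 or 0xffffffff, using
 * "srl a1,0x20,t0", "beq", "addl t0,0x1,t0", "beq". The sketch below
 * is a host-side model of that test. addl is a 32-bit add, so
 * 0xffffffff + 1 wraps to zero. The helper names are hypothetical.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration; it does not exist in GXemul. */

/* Returns 1 if the upper 32 bits of vaddr are all zeroes or all ones. */
static int upper32_zero_or_ones(uint64_t vaddr)
{
	uint64_t t0 = vaddr >> 32;		/* srl a1,0x20,t0 */

	if (t0 == 0)				/* beq t0,<ok1> */
		return 1;
	t0 = (uint32_t)(t0 + 1);		/* addl t0,0x1,t0 */
	return t0 == 0;				/* beq t0,<ok1> */
}

/* One address of each kind: low, sign-extended, and true 64-bit. */
static void upper32_selfcheck(void)
{
	assert(upper32_zero_or_ones(0x0000000080001000ULL) == 1);
	assert(upper32_zero_or_ones(0xffffffff80001000ULL) == 1);
	assert(upper32_zero_or_ones(0x0000000180001000ULL) == 0);
}
```

/* Calling upper32_selfcheck() from a test program should pass silently. */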
1311     /*
1312     * bintrans_write_instruction__loadstore():
1313     */
1314     static int bintrans_write_instruction__loadstore(
1315     struct memory *mem, unsigned char **addrp,
1316     int rt, int imm, int rs, int instruction_type, int bigendian)
1317     {
1318     unsigned char *a, *fail, *generic64bit = NULL, *generic64bitA = NULL;
1319     unsigned char *doloadstore = NULL,
1320     *ok_unaligned_load3, *ok_unaligned_load2, *ok_unaligned_load1;
1321     uint32_t *b;
1322     int ofs, alignment, load = 0, alpha_rs, alpha_rt, unaligned = 0;
1323    
1324     /* TODO: Not yet: */
1325     if (instruction_type == HI6_LQ_MDMX || instruction_type == HI6_SQ) {
1326     return 0;
1327     }
1328    
1329     switch (instruction_type) {
1330     case HI6_LQ_MDMX:
1331     case HI6_LD:
1332     case HI6_LDL:
1333     case HI6_LDR:
1334     case HI6_LWU:
1335     case HI6_LW:
1336     case HI6_LWL:
1337     case HI6_LWR:
1338     case HI6_LHU:
1339     case HI6_LH:
1340     case HI6_LBU:
1341     case HI6_LB:
1342     load = 1;
1343     if (rt == 0)
1344     return 0;
1345     }
1346    
1347     switch (instruction_type) {
1348     case HI6_LDL:
1349     case HI6_LDR:
1350     case HI6_LWL:
1351     case HI6_LWR:
1352     case HI6_SDL:
1353     case HI6_SDR:
1354     case HI6_SWL:
1355     case HI6_SWR:
1356     unaligned = 1;
1357     }
1358    
1359     a = *addrp;
1360    
1361     /*
1362     * a1 = gpr[rs] + imm;
1363     *
1364     * 88 08 30 a4 ldq t0,2184(a0)
1365     * 34 12 21 22 lda a1,4660(t0)
1366     */
1367    
1368     alpha_rs = map_MIPS_to_Alpha[rs];
1369     if (alpha_rs < 0) {
1370     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
1371     alpha_rs = ALPHA_T0;
1372     }
1373     *a++ = imm; *a++ = (imm >> 8); *a++ = 0x20 + alpha_rs; *a++ = 0x22;
1374    
1375     alignment = 0;
1376     switch (instruction_type) {
1377     case HI6_LQ_MDMX:
1378     case HI6_SQ:
1379     alignment = 15;
1380     break;
1381     case HI6_LD:
1382     case HI6_LDL:
1383     case HI6_LDR:
1384     case HI6_SD:
1385     case HI6_SDL:
1386     case HI6_SDR:
1387     alignment = 7;
1388     break;
1389     case HI6_LW:
1390     case HI6_LWL:
1391     case HI6_LWR:
1392     case HI6_LWU:
1393     case HI6_SW:
1394     case HI6_SWL:
1395     case HI6_SWR:
1396     alignment = 3;
1397     break;
1398     case HI6_LH:
1399     case HI6_LHU:
1400     case HI6_SH:
1401     alignment = 1;
1402     break;
1403     }
1404    
1405     if (unaligned) {
1406     /*
1407     * Unaligned load/store: Perform the host load/store at
1408     * an aligned address, and then figure out which bytes to
1409     * actually load into the destination register.
1410     *
1411     * 02 30 20 46 and a1,alignment,t1
1412     * 31 05 22 42 subq a1,t1,a1
1413     */
1414     *a++ = 0x02; *a++ = 0x10 + alignment * 0x20; *a++ = 0x20 + (alignment >> 3); *a++ = 0x46;
1415     *a++ = 0x31; *a++ = 0x05; *a++ = 0x22; *a++ = 0x42;
1416     } else if (alignment > 0) {
1417     /*
1418     * Check alignment:
1419     *
1420     * 02 30 20 46 and a1,0x1,t1
1421     * 02 70 20 46 and a1,0x3,t1 (one of these "and"s)
1422     * 02 f0 20 46 and a1,0x7,t1
1423     * 02 f0 21 46 and a1,0xf,t1
1424     * 01 00 40 e4 beq t1,<okalign>
1425     * 01 80 fa 6b ret
1426     */
1427     *a++ = 0x02; *a++ = 0x10 + alignment * 0x20; *a++ = 0x20 + (alignment >> 3); *a++ = 0x46;
1428     fail = a;
1429     *a++ = 0x01; *a++ = 0x00; *a++ = 0x40; *a++ = 0xe4;
1430     *addrp = a;
1431     bintrans_write_chunkreturn_fail(addrp);
1432     a = *addrp;
1433     *fail = ((size_t)a - (size_t)fail - 4) / 4;
1434     }
1435    
1436     alpha_rt = map_MIPS_to_Alpha[rt];
1437    
1438     if (mem->bintrans_32bit_only) {
1439     /* Special case for 32-bit addressing: */
1440    
1441     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_loadstore_32bit) - (size_t)&dummy_cpu;
1442     /* ldq t12,bintrans_loadstore_32bit(a0) */
1443     *a++ = ofs; *a++ = ofs >> 8; *a++ = 0x70; *a++ = 0xa7;
1444    
1445     /* jsr t4,(t12),<after> */
1446     *a++ = 0x00; *a++ = 0x40; *a++ = 0xbb; *a++ = 0x68;
1447    
1448     /*
1449     * Now:
1450     * a3 = host page
1451     * t0 = 0 for readonly pages, 1 for read/write pages
1452     * t3 = address of host load/store
1453     */
1454    
1455     /* If this is a store, then the lowest bit must be set: */
1456     if (!load) {
1457     /* 01 00 20 f4 bne t0,<okzzz> */
1458     fail = a;
1459     *a++ = 0x01; *a++ = 0x00; *a++ = 0x20; *a++ = 0xf4;
1460     bintrans_write_chunkreturn_fail(&a);
1461     *fail = ((size_t)a - (size_t)fail - 4) / 4;
1462     }
1463     } else {
1464     /*
1465     * If the highest 33 bits of the address are either all ones
1466     * or all zeroes, then the tables used for 32-bit load/stores
1467     * can be used.
1468     */
1469     *a++ = 0x81; *a++ = 0xf6; *a++ = 0x23; *a++ = 0x4a; /* srl a1,0x1f,t0 */
1470     *a++ = 0x01; *a++ = 0x30; *a++ = 0x20; *a++ = 0x44; /* and t0,0x1,t0 */
1471     *a++ = 0x04; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4; /* beq t0,<noll> */
1472     *a++ = 0x81; *a++ = 0x16; *a++ = 0x24; *a++ = 0x4a; /* srl a1,0x20,t0 */
1473     *a++ = 0x01; *a++ = 0x30; *a++ = 0x20; *a++ = 0x40; /* addl t0,0x1,t0 */
1474     *a++ = 0x04; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4; /* beq t0,<ok> */
1475     generic64bit = a;
1476     *a++ = 0x04; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3; /* br <generic> */
1477     /* <noll>: */
1478     *a++ = 0x81; *a++ = 0x16; *a++ = 0x24; *a++ = 0x4a; /* srl a1,0x20,t0 */
1479     *a++ = 0x01; *a++ = 0x00; *a++ = 0x20; *a++ = 0xe4; /* beq t0,<ok> */
1480     generic64bitA = a;
1481     *a++ = 0x04; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3; /* br <generic> */
1482    
1483     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_loadstore_32bit) - (size_t)&dummy_cpu;
1484     /* ldq t12,bintrans_loadstore_32bit(a0) */
1485     *a++ = ofs; *a++ = ofs >> 8; *a++ = 0x70; *a++ = 0xa7;
1486    
1487     /* jsr t4,(t12),<after> */
1488     *a++ = 0x00; *a++ = 0x40; *a++ = 0xbb; *a++ = 0x68;
1489    
1490     /*
1491     * Now:
1492     * a3 = host page (or NULL if not found)
1493     * t0 = 0 for readonly pages, 1 for read/write pages
1494     * t3 = (potential) address of host load/store
1495     */
1496    
1497     /* If this is a store, then the lowest bit must be set: */
1498     if (!load) {
1499     /* 01 00 20 f4 bne t0,<okzzz> */
1500     fail = a;
1501     *a++ = 0x01; *a++ = 0x00; *a++ = 0x20; *a++ = 0xf4;
1502     bintrans_write_chunkreturn_fail(&a);
1503     *fail = ((size_t)a - (size_t)fail - 4) / 4;
1504     }
1505    
1506     doloadstore = a;
1507     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1508    
1509    
1510     /*
1511     * Generic (64-bit) load/store:
1512     */
1513    
1514     if (generic64bit != NULL)
1515     *generic64bit = ((size_t)a - (size_t)generic64bit - 4) / 4;
1516     if (generic64bitA != NULL)
1517     *generic64bitA = ((size_t)a - (size_t)generic64bitA - 4) / 4;
1518    
1519     *addrp = a;
1520     b = (uint32_t *) *addrp;
1521    
1522     /* Save ra, a0 and the temporaries t5..t11 on the stack: */
1523     *b++ = 0x23deff80; /* lda sp,-128(sp) */
1524    
1525     *b++ = 0xb75e0000; /* stq ra,0(sp) */
1526     *b++ = 0xb61e0008; /* stq a0,8(sp) */
1527     *b++ = 0xb4de0010; /* stq t5,16(sp) */
1528     *b++ = 0xb0fe0018; /* stl t6,24(sp) */
1529     *b++ = 0xb71e0020; /* stq t10,32(sp) */
1530     *b++ = 0xb73e0028; /* stq t11,40(sp) */
1531     *b++ = 0xb51e0030; /* stq t7,48(sp) */
1532     *b++ = 0xb6de0038; /* stq t8,56(sp) */
1533     *b++ = 0xb6fe0040; /* stq t9,64(sp) */
1534    
1535     ofs = ((size_t)&dummy_cpu.cd.mips.fast_vaddr_to_hostaddr) - (size_t)&dummy_cpu;
1536    
1537     *b++ = 0xa7700000 | ofs; /* ldq t12,fast_vaddr_to_hostaddr(a0) */
1538    
1539     /* a1 is already vaddr. set a2 = writeflag */
1540     *b++ = 0x225f0000 | (load? 0 : 1);
1541    
1542     /* Call fast_vaddr_to_hostaddr: */
1543     *b++ = 0x6b5b4000; /* jsr ra,(t12),<after> */
1544    
1545     /* Restore ra, a0 and the temporaries t5..t11 from the stack: */
1546     *b++ = 0xa75e0000; /* ldq ra,0(sp) */
1547     *b++ = 0xa61e0008; /* ldq a0,8(sp) */
1548     *b++ = 0xa4de0010; /* ldq t5,16(sp) */
1549     *b++ = 0xa0fe0018; /* ldl t6,24(sp) */
1550     *b++ = 0xa71e0020; /* ldq t10,32(sp) */
1551     *b++ = 0xa73e0028; /* ldq t11,40(sp) */
1552     *b++ = 0xa51e0030; /* ldq t7,48(sp) */
1553     *b++ = 0xa6de0038; /* ldq t8,56(sp) */
1554     *b++ = 0xa6fe0040; /* ldq t9,64(sp) */
1555    
1556     *b++ = 0x23de0080; /* lda sp,128(sp) */
1557    
1558     *addrp = (unsigned char *) b;
1559     a = *addrp;
1560    
1561     /*
1562     * NULL? Then return failure.
1563     * 01 00 00 f4 bne v0,f8 <okzz>
1564     */
1565     fail = a;
1566     *a++ = 0x01; *a++ = 0x00; *a++ = 0x00; *a++ = 0xf4;
1567     bintrans_write_chunkreturn_fail(&a);
1568     *fail = ((size_t)a - (size_t)fail - 4) / 4;
1569    
1570     /* The rest of this code was written with t3 as the address. */
1571    
1572     /* 04 14 00 40 addq v0,0,t3 */
1573     *a++ = 0x04; *a++ = 0x14; *a++ = 0x00; *a++ = 0x40;
1574    
1575     if (doloadstore != NULL)
1576     *doloadstore = ((size_t)a - (size_t)doloadstore - 4) / 4;
1577     }
1578    
1579    
1580     switch (instruction_type) {
1581     case HI6_LQ_MDMX:
1582     /* TODO */
1583     break;
1584     case HI6_LD:
1585     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xa4; /* ldq t0,0(t3) */
1586     if (bigendian) {
1587     /* remember original 8 bytes of t0: */
1588     *a++ = 0x05; *a++ = 0x04; *a++ = 0x3f; *a++ = 0x40; /* addq t0,zero,t4 */
1589    
1590     /* swap lowest 4 bytes: */
1591     *a++ = 0x62; *a++ = 0x71; *a++ = 0x20; *a++ = 0x48; /* insbl t0,3,t1 */
1592     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1593     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48; /* sll t2,16,t2 */
1594     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1595     *a++ = 0xc3; *a++ = 0x50; *a++ = 0x20; *a++ = 0x48; /* extbl t0,2,t2 */
1596     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48; /* sll t2,8,t2 */
1597     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1598     *a++ = 0xc3; *a++ = 0x70; *a++ = 0x20; *a++ = 0x48; /* extbl t0,3,t2 */
1599     *a++ = 0x01; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t0 */
1600    
1601     /* save result in (top 4 bytes of) t1, then t4. get back top bits of t4: */
1602     *a++ = 0x22; *a++ = 0x17; *a++ = 0x24; *a++ = 0x48; /* sll t0,0x20,t1 */
1603     *a++ = 0x81; *a++ = 0x16; *a++ = 0xa4; *a++ = 0x48; /* srl t4,0x20,t0 */
1604     *a++ = 0x05; *a++ = 0x14; *a++ = 0x40; *a++ = 0x40; /* addq t1,0,t4 */
1605    
1606     /* swap highest 4 bytes: */
1607     *a++ = 0x62; *a++ = 0x71; *a++ = 0x20; *a++ = 0x48; /* insbl t0,3,t1 */
1608     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1609     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48; /* sll t2,16,t2 */
1610     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1611     *a++ = 0xc3; *a++ = 0x50; *a++ = 0x20; *a++ = 0x48; /* extbl t0,2,t2 */
1612     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48; /* sll t2,8,t2 */
1613     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1614     *a++ = 0xc3; *a++ = 0x70; *a++ = 0x20; *a++ = 0x48; /* extbl t0,3,t2 */
1615     *a++ = 0x01; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t0 */
1616    
1617     /* or the results together: */
1618     *a++ = 0x01; *a++ = 0x04; *a++ = 0xa1; *a++ = 0x44; /* or t4,t0,t0 */
1619     }
1620     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rt);
1621     break;
1622     case HI6_LW:
1623     case HI6_LWU:
1624     if (alpha_rt < 0 || bigendian || instruction_type == HI6_LWU)
1625     alpha_rt = ALPHA_T0;
1626     /* ldl rt,0(t3) */
1627     *a++ = 0x00; *a++ = 0x00; *a++ = 0x04 | ((alpha_rt & 7) << 5);
1628     *a++ = 0xa0 | ((alpha_rt >> 3) & 3);
1629     if (bigendian) {
1630     *a++ = 0x62; *a++ = 0x71; *a++ = 0x20; *a++ = 0x48; /* insbl t0,3,t1 */
1631     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1632     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48; /* sll t2,16,t2 */
1633     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1634     *a++ = 0xc3; *a++ = 0x50; *a++ = 0x20; *a++ = 0x48; /* extbl t0,2,t2 */
1635     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48; /* sll t2,8,t2 */
1636     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1637     *a++ = 0xc3; *a++ = 0x70; *a++ = 0x20; *a++ = 0x48; /* extbl t0,3,t2 */
1638     *a++ = 0x01; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t0 */
1639     *a++ = 0x01; *a++ = 0x00; *a++ = 0x3f; *a++ = 0x40; /* addl t0,zero,t0 (sign extend) 32->64 */
1640     }
1641     if (instruction_type == HI6_LWU) {
1642     /* Use only lowest 32 bits: */
1643     *a++ = 0x21; *a++ = 0xf6; *a++ = 0x21; *a++ = 0x48; /* zapnot t0,0xf,t0 */
1644     }
1645     if (alpha_rt == ALPHA_T0)
1646     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rt);
1647     break;
1648     case HI6_LHU:
1649     case HI6_LH:
1650     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0x30; /* ldwu from memory */
1651     if (bigendian) {
1652     *a++ = 0x62; *a++ = 0x31; *a++ = 0x20; *a++ = 0x48; /* insbl t0,1,t1 */
1653     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1654     *a++ = 0x01; *a++ = 0x04; *a++ = 0x43; *a++ = 0x44; /* or t1,t2,t0 */
1655     }
1656     if (instruction_type == HI6_LH) {
1657     *a++ = 0x21; *a++ = 0x00; *a++ = 0xe1; *a++ = 0x73; /* sextw t0,t0 */
1658     }
1659     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rt);
1660     break;
1661     case HI6_LBU:
1662     case HI6_LB:
1663     if (alpha_rt < 0)
1664     alpha_rt = ALPHA_T0;
1665     /* ldbu rt,0(t3) */
1666     *a++ = 0x00; *a++ = 0x00; *a++ = 0x04 | ((alpha_rt & 7) << 5);
1667     *a++ = 0x28 | ((alpha_rt >> 3) & 3);
1668     if (instruction_type == HI6_LB) {
1669     /* sextb rt,rt */
1670     *a++ = alpha_rt; *a++ = 0x00; *a++ = 0xe0 + alpha_rt; *a++ = 0x73;
1671     }
1672     if (alpha_rt == ALPHA_T0)
1673     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rt);
1674     break;
1675    
1676     case HI6_LWL:
1677     /* a1 = 0..3 (or 0..7 for 64-bit loads): */
1678     alpha_rs = map_MIPS_to_Alpha[rs];
1679     if (alpha_rs < 0) {
1680     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
1681     alpha_rs = ALPHA_T0;
1682     }
1683     *a++ = imm; *a++ = (imm >> 8); *a++ = 0x20 + alpha_rs; *a++ = 0x22;
1684     /* 02 30 20 46 and a1,alignment,t1 */
1685     *a++ = 0x02; *a++ = 0x10 + alignment * 0x20; *a++ = 0x20 + (alignment >> 3); *a++ = 0x46;
1686    
1687     /* ldl t0,0(t3) */
1688     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xa0;
1689    
1690     if (bigendian) {
1691     /* TODO */
1692     bintrans_write_chunkreturn_fail(&a);
1693     }
1694     /*
1695     * lwl: memory = 0x12 0x34 0x56 0x78
1696     * offset (a1): register rt becomes:
1697     * 0 0x12......
1698     * 1 0x3412....
1699     * 2 0x563412..
1700     * 3 0x78563412
1701     */
1702    
1703     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T2);
1704    
1705     /*
1706     10: 03 00 9f 20 lda t3,3
1707     14: a5 05 82 40 cmpeq t3,t1,t4
1708     18: 02 00 a0 e4 beq t4,24 <skip>
1709     */
1710     *a++ = 0x03; *a++ = 0x00; *a++ = 0x9f; *a++ = 0x20;
1711     *a++ = 0xa5; *a++ = 0x05; *a++ = 0x82; *a++ = 0x40;
1712     *a++ = 0x02; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
1713    
1714     /* 03 14 20 40 addq t0,0,t2 */
1715     *a++ = 0x03; *a++ = 0x14; *a++ = 0x20; *a++ = 0x40;
1716    
1717     ok_unaligned_load3 = a;
1718     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1719    
1720    
1721    
1722     *a++ = 0x02; *a++ = 0x00; *a++ = 0x9f; *a++ = 0x20;
1723     *a++ = 0xa5; *a++ = 0x05; *a++ = 0x82; *a++ = 0x40;
1724     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
1725     /*
1726     * 2 0x563412..
1727     2c: 21 17 21 48 sll t0,0x8,t0
1728     30: 01 10 20 40 addl t0,0,t0
1729     34: 03 f0 7f 44 and t2,0xff,t2
1730     38: 03 04 23 44 or t0,t2,t2
1731     */
1732     *a++ = 0x21; *a++ = 0x17; *a++ = 0x21; *a++ = 0x48;
1733     *a++ = 0x01; *a++ = 0x10; *a++ = 0x20; *a++ = 0x40;
1734     *a++ = 0x03; *a++ = 0xf0; *a++ = 0x7f; *a++ = 0x44;
1735     *a++ = 0x03; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
1736    
1737     ok_unaligned_load2 = a;
1738     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1739    
1740    
1741    
1742     *a++ = 0x01; *a++ = 0x00; *a++ = 0x9f; *a++ = 0x20;
1743     *a++ = 0xa5; *a++ = 0x05; *a++ = 0x82; *a++ = 0x40;
1744     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
1745     /*
1746     * 1 0x3412....
1747     2c: 21 17 22 48 sll t0,0x10,t0
1748     30: 01 10 20 40 addl t0,0,t0
1749     34: 23 76 60 48 zapnot t2,0x3,t2
1750     38: 03 04 23 44 or t0,t2,t2
1751     */
1752     *a++ = 0x21; *a++ = 0x17; *a++ = 0x22; *a++ = 0x48;
1753     *a++ = 0x01; *a++ = 0x10; *a++ = 0x20; *a++ = 0x40;
1754     *a++ = 0x23; *a++ = 0x76; *a++ = 0x60; *a++ = 0x48;
1755     *a++ = 0x03; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
1756    
1757     ok_unaligned_load1 = a;
1758     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1759    
1760    
1761    
1762    
1763     /*
1764     * 0 0x12......
1765     2c: 21 17 23 48 sll t0,0x18,t0
1766     30: 01 10 20 40 addl t0,0,t0
1767     34: 23 f6 60 48 zapnot t2,0x7,t2
1768     38: 03 04 23 44 or t0,t2,t2
1769     */
1770     *a++ = 0x21; *a++ = 0x17; *a++ = 0x23; *a++ = 0x48;
1771     *a++ = 0x01; *a++ = 0x10; *a++ = 0x20; *a++ = 0x40;
1772     *a++ = 0x23; *a++ = 0xf6; *a++ = 0x60; *a++ = 0x48;
1773     *a++ = 0x03; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
1774    
1775    
1776     *ok_unaligned_load3 = ((size_t)a - (size_t)ok_unaligned_load3 - 4) / 4;
1777     *ok_unaligned_load2 = ((size_t)a - (size_t)ok_unaligned_load2 - 4) / 4;
1778     *ok_unaligned_load1 = ((size_t)a - (size_t)ok_unaligned_load1 - 4) / 4;
1779    
1780     /* 03 10 60 40 addl t2,0,t2 */
1781     *a++ = 0x03; *a++ = 0x10; *a++ = 0x60; *a++ = 0x40;
1782    
1783     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T2, rt);
1784     break;
1785    
1786     case HI6_LWR:
1787     /* a1 = 0..3 (or 0..7 for 64-bit loads): */
1788     alpha_rs = map_MIPS_to_Alpha[rs];
1789     if (alpha_rs < 0) {
1790     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
1791     alpha_rs = ALPHA_T0;
1792     }
1793     *a++ = imm; *a++ = (imm >> 8); *a++ = 0x20 + alpha_rs; *a++ = 0x22;
1794     /* 02 30 20 46 and a1,alignment,t1 */
1795     *a++ = 0x02; *a++ = 0x10 + alignment * 0x20; *a++ = 0x20 + (alignment >> 3); *a++ = 0x46;
1796    
1797     /* ldl t0,0(t3) */
1798     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xa0;
1799    
1800     if (bigendian) {
1801     /* TODO */
1802     bintrans_write_chunkreturn_fail(&a);
1803     }
1804     /*
1805     * lwr: memory = 0x12 0x34 0x56 0x78
1806     * offset (a1): register rt becomes:
1807     * 0 0x78563412
1808     * 1 0x..785634
1809     * 2 0x....7856
1810     * 3 0x......78
1811     */
1812    
1813     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T2);
1814    
1815     /*
1816     10: 03 00 9f 20 lda t3,3
1817     14: a5 05 82 40 cmpeq t3,t1,t4
1818     18: 05 00 a0 e4 beq t4,30 <skip>
1819     */
1820     *a++ = 0x03; *a++ = 0x00; *a++ = 0x9f; *a++ = 0x20;
1821     *a++ = 0xa5; *a++ = 0x05; *a++ = 0x82; *a++ = 0x40;
1822     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
1823    
1824     /*
1825     2c: 81 16 23 48 srl t0,0x18,t0
1826     b0: 21 36 20 48 zapnot t0,0x1,t0
1827     34: 23 d6 7f 48 zapnot t2,0xfe,t2
1828     38: 03 04 23 44 or t0,t2,t2
1829     */
1830     *a++ = 0x81; *a++ = 0x16; *a++ = 0x23; *a++ = 0x48;
1831     *a++ = 0x21; *a++ = 0x36; *a++ = 0x20; *a++ = 0x48;
1832     *a++ = 0x23; *a++ = 0xd6; *a++ = 0x7f; *a++ = 0x48;
1833     *a++ = 0x03; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
1834    
1835     ok_unaligned_load3 = a;
1836     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1837    
1838    
1839    
1840     *a++ = 0x02; *a++ = 0x00; *a++ = 0x9f; *a++ = 0x20;
1841     *a++ = 0xa5; *a++ = 0x05; *a++ = 0x82; *a++ = 0x40;
1842     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
1843     /*
1844     2c: 81 16 22 48 srl t0,0x10,t0
1845     b4: 21 76 20 48 zapnot t0,0x3,t0
1846     34: 23 96 7f 48 zapnot t2,0xfc,t2
1847     38: 03 04 23 44 or t0,t2,t2
1848     */
1849     *a++ = 0x81; *a++ = 0x16; *a++ = 0x22; *a++ = 0x48;
1850     *a++ = 0x21; *a++ = 0x76; *a++ = 0x20; *a++ = 0x48;
1851     *a++ = 0x23; *a++ = 0x96; *a++ = 0x7f; *a++ = 0x48;
1852     *a++ = 0x03; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
1853    
1854     ok_unaligned_load2 = a;
1855     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1856    
1857    
1858    
1859     *a++ = 0x01; *a++ = 0x00; *a++ = 0x9f; *a++ = 0x20;
1860     *a++ = 0xa5; *a++ = 0x05; *a++ = 0x82; *a++ = 0x40;
1861     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
1862     /*
1863     2c: 81 16 21 48 srl t0,0x8,t0
1864     b8: 21 f6 20 48 zapnot t0,0x7,t0
1865     3c: 23 16 7f 48 zapnot t2,0xf8,t2
1866     40: 03 04 23 44 or t0,t2,t2
1867     */
1868     *a++ = 0x81; *a++ = 0x16; *a++ = 0x21; *a++ = 0x48;
1869     *a++ = 0x21; *a++ = 0xf6; *a++ = 0x20; *a++ = 0x48;
1870     *a++ = 0x23; *a++ = 0x16; *a++ = 0x7f; *a++ = 0x48;
1871     *a++ = 0x03; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
1872    
1873     ok_unaligned_load1 = a;
1874     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
1875    
1876    
1877    
1878    
1879     /*
1880     * 0 0x12......
1881     */
1882     /* 03 14 20 40 addq t0,0,t2 */
1883     *a++ = 0x03; *a++ = 0x14; *a++ = 0x20; *a++ = 0x40;
1884    
1885    
1886    
1887     *ok_unaligned_load3 = ((size_t)a - (size_t)ok_unaligned_load3 - 4) / 4;
1888     *ok_unaligned_load2 = ((size_t)a - (size_t)ok_unaligned_load2 - 4) / 4;
1889     *ok_unaligned_load1 = ((size_t)a - (size_t)ok_unaligned_load1 - 4) / 4;
1890    
1891     /* 03 10 60 40 addl t2,0,t2 */
1892     *a++ = 0x03; *a++ = 0x10; *a++ = 0x60; *a++ = 0x40;
1893    
1894     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T2, rt);
1895     break;
1896    
1897     case HI6_SQ:
1898     /* TODO */
1899     break;
1900     case HI6_SD:
1901     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T0);
1902     if (bigendian) {
1903     /* remember original 8 bytes of t0: */
1904     *a++ = 0x05; *a++ = 0x04; *a++ = 0x3f; *a++ = 0x40; /* addq t0,zero,t4 */
1905    
1906     /* swap lowest 4 bytes: */
1907     *a++ = 0x62; *a++ = 0x71; *a++ = 0x20; *a++ = 0x48; /* insbl t0,3,t1 */
1908     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1909     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48; /* sll t2,16,t2 */
1910     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1911     *a++ = 0xc3; *a++ = 0x50; *a++ = 0x20; *a++ = 0x48; /* extbl t0,2,t2 */
1912     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48; /* sll t2,8,t2 */
1913     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1914     *a++ = 0xc3; *a++ = 0x70; *a++ = 0x20; *a++ = 0x48; /* extbl t0,3,t2 */
1915     *a++ = 0x01; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t0 */
1916    
1917     /* save result in (top 4 bytes of) t1, then t4. get back top bits of t4: */
1918     *a++ = 0x22; *a++ = 0x17; *a++ = 0x24; *a++ = 0x48; /* sll t0,0x20,t1 */
1919     *a++ = 0x81; *a++ = 0x16; *a++ = 0xa4; *a++ = 0x48; /* srl t4,0x20,t0 */
1920     *a++ = 0x05; *a++ = 0x14; *a++ = 0x40; *a++ = 0x40; /* addq t1,0,t4 */
1921    
1922     /* swap highest 4 bytes: */
1923     *a++ = 0x62; *a++ = 0x71; *a++ = 0x20; *a++ = 0x48; /* insbl t0,3,t1 */
1924     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1925     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48; /* sll t2,16,t2 */
1926     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1927     *a++ = 0xc3; *a++ = 0x50; *a++ = 0x20; *a++ = 0x48; /* extbl t0,2,t2 */
1928     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48; /* sll t2,8,t2 */
1929     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1930     *a++ = 0xc3; *a++ = 0x70; *a++ = 0x20; *a++ = 0x48; /* extbl t0,3,t2 */
1931     *a++ = 0x01; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t0 */
1932    
1933     /* or the results together: */
1934     *a++ = 0x01; *a++ = 0x04; *a++ = 0xa1; *a++ = 0x44; /* or t4,t0,t0 */
1935     }
1936     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xb4; /* stq to memory */
1937     break;
1938     case HI6_SW:
1939     if (alpha_rt < 0 || bigendian) {
1940     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T0);
1941     alpha_rt = ALPHA_T0;
1942     }
1943     if (bigendian) {
1944     *a++ = 0x62; *a++ = 0x71; *a++ = 0x20; *a++ = 0x48; /* insbl t0,3,t1 */
1945     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1946     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48; /* sll t2,16,t2 */
1947     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1948     *a++ = 0xc3; *a++ = 0x50; *a++ = 0x20; *a++ = 0x48; /* extbl t0,2,t2 */
1949     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48; /* sll t2,8,t2 */
1950     *a++ = 0x02; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t1 */
1951     *a++ = 0xc3; *a++ = 0x70; *a++ = 0x20; *a++ = 0x48; /* extbl t0,3,t2 */
1952     *a++ = 0x01; *a++ = 0x04; *a++ = 0x62; *a++ = 0x44; /* or t2,t1,t0 */
1953     }
1954     /* stl to memory: stl rt,0(t3) */
1955     *a++ = 0x00; *a++ = 0x00; *a++ = 0x04 | ((alpha_rt & 7) << 5);
1956     *a++ = 0xb0 | ((alpha_rt >> 3) & 3);
1957     break;
1958     case HI6_SH:
1959     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T0);
1960     if (bigendian) {
1961     *a++ = 0x62; *a++ = 0x31; *a++ = 0x20; *a++ = 0x48; /* insbl t0,1,t1 */
1962     *a++ = 0xc3; *a++ = 0x30; *a++ = 0x20; *a++ = 0x48; /* extbl t0,1,t2 */
1963     *a++ = 0x01; *a++ = 0x04; *a++ = 0x43; *a++ = 0x44; /* or t1,t2,t0 */
1964     }
1965     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0x34; /* stw to memory */
1966     break;
1967     case HI6_SB:
1968     if (alpha_rt < 0) {
1969     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T0);
1970     alpha_rt = ALPHA_T0;
1971     }
1972     /* stb to memory: stb rt,0(t3) */
1973     *a++ = 0x00; *a++ = 0x00; *a++ = 0x04 | ((alpha_rt & 7) << 5);
1974     *a++ = 0x38 | ((alpha_rt >> 3) & 3);
1975     break;
1976    
1977     case HI6_SWL:
1978     /* a1 = 0..3 (or 0..7 for 64-bit stores): */
1979     alpha_rs = map_MIPS_to_Alpha[rs];
1980     if (alpha_rs < 0) {
1981     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
1982     alpha_rs = ALPHA_T0;
1983     }
1984     *a++ = imm; *a++ = (imm >> 8); *a++ = 0x20 + alpha_rs; *a++ = 0x22;
1985     /* 02 30 20 46 and a1,alignment,t1 */
1986     *a++ = 0x02; *a++ = 0x10 + alignment * 0x20; *a++ = 0x20 + (alignment >> 3); *a++ = 0x46;
1987    
1988     /* ldl t0,0(t3) */
1989     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xa0;
1990    
1991     if (bigendian) {
1992     /* TODO */
1993     bintrans_write_chunkreturn_fail(&a);
1994     }
1995    
1996     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T2);
1997    
1998     /*
1999     * swl: memory = 0x12 0x34 0x56 0x78
2000     * register = 0x89abcdef
2001     * offset (a1): memory becomes:
2002     * 0 0x89 0x.. 0x.. 0x..
2003     * 1 0xab 0x89 0x.. 0x..
2004     * 2 0xcd 0xab 0x89 0x..
2005     * 3 0xef 0xcd 0xab 0x89
2006     */
2007    
2008     /*
2009     a5 75 40 40 cmpeq t1,0x03,t4
2010     01 00 a0 e4 beq t4,20 <skip>
2011     */
2012     *a++ = 0xa5; *a++ = 0x75; *a++ = 0x40; *a++ = 0x40;
2013     *a++ = 0x02; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
2014    
2015     /* 01 10 60 40 addl t2,0,t0 */
2016     *a++ = 0x01; *a++ = 0x10; *a++ = 0x60; *a++ = 0x40;
2017    
2018     ok_unaligned_load3 = a;
2019     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
2020    
2021    
2022    
2023    
2024     *a++ = 0xa5; *a++ = 0x55; *a++ = 0x40; *a++ = 0x40;
2025     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
2026     /*
2027     2:
2028     e8: 83 16 61 48 srl t2,0x8,t2
2029     ec: 23 f6 60 48 zapnot t2,0x7,t2
2030     f0: 21 16 3f 48 zapnot t0,0xf8,t0
2031     f4: 01 04 23 44 or t0,t2,t0
2032     */
2033     *a++ = 0x83; *a++ = 0x16; *a++ = 0x61; *a++ = 0x48;
2034     *a++ = 0x23; *a++ = 0xf6; *a++ = 0x60; *a++ = 0x48;
2035     *a++ = 0x21; *a++ = 0x16; *a++ = 0x3f; *a++ = 0x48;
2036     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
2037    
2038     ok_unaligned_load2 = a;
2039     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
2040    
2041    
2042    
2043     *a++ = 0xa5; *a++ = 0x35; *a++ = 0x40; *a++ = 0x40;
2044     *a++ = 0x05; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
2045     /*
2046     1:
2047     f8: 83 16 62 48 srl t2,0x10,t2
2048     fc: 23 76 60 48 zapnot t2,0x3,t2
2049     100: 21 96 3f 48 zapnot t0,0xfc,t0
2050     104: 01 04 23 44 or t0,t2,t0
2051     */
2052     *a++ = 0x83; *a++ = 0x16; *a++ = 0x62; *a++ = 0x48;
2053     *a++ = 0x23; *a++ = 0x76; *a++ = 0x60; *a++ = 0x48;
2054     *a++ = 0x21; *a++ = 0x96; *a++ = 0x3f; *a++ = 0x48;
2055     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
2056    
2057     ok_unaligned_load1 = a;
2058     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
2059    
2060    
2061    
2062    
2063    
2064     /*
2065     0:
2066     108: 83 16 63 48 srl t2,0x18,t2
2067     10c: 23 36 60 48 zapnot t2,0x1,t2
2068     110: 21 d6 3f 48 zapnot t0,0xfe,t0
2069     114: 01 04 23 44 or t0,t2,t0
2070     */
2071     *a++ = 0x83; *a++ = 0x16; *a++ = 0x63; *a++ = 0x48;
2072     *a++ = 0x23; *a++ = 0x36; *a++ = 0x60; *a++ = 0x48;
2073     *a++ = 0x21; *a++ = 0xd6; *a++ = 0x3f; *a++ = 0x48;
2074     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
2075    
2076    
2077     *ok_unaligned_load3 = ((size_t)a - (size_t)ok_unaligned_load3 - 4) / 4;
2078     *ok_unaligned_load2 = ((size_t)a - (size_t)ok_unaligned_load2 - 4) / 4;
2079     *ok_unaligned_load1 = ((size_t)a - (size_t)ok_unaligned_load1 - 4) / 4;
2080    
2081     /* stl t0,0(t3) */
2082     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xb0;
2083     break;
2084    
2085     case HI6_SWR:
2086     /* a1 = 0..3 (or 0..7 for 64-bit stores): */
2087     alpha_rs = map_MIPS_to_Alpha[rs];
2088     if (alpha_rs < 0) {
2089     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rs, ALPHA_T0);
2090     alpha_rs = ALPHA_T0;
2091     }
2092     *a++ = imm; *a++ = (imm >> 8); *a++ = 0x20 + alpha_rs; *a++ = 0x22;
2093     /* 02 30 20 46 and a1,alignment,t1 */
2094     *a++ = 0x02; *a++ = 0x10 + alignment * 0x20; *a++ = 0x20 + (alignment >> 3); *a++ = 0x46;
2095    
2096     /* ldl t0,0(t3) */
2097     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xa0;
2098    
2099     if (bigendian) {
2100     /* TODO */
2101     bintrans_write_chunkreturn_fail(&a);
2102     }
2103    
2104     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rt, ALPHA_T2);
2105    
2106     /*
2107     * swr: memory = 0x12 0x34 0x56 0x78
2108     * register = 0x89abcdef
2109     * offset (a1): memory becomes:
2110     * 0 0xef 0xcd 0xab 0x89
2111     * 1 0x.. 0xef 0xcd 0xab
2112     * 2 0x.. 0x.. 0xef 0xcd
2113     * 3 0x.. 0x.. 0x.. 0xef
2114     */
2115    
2116    
2117     /*
2118     a5 75 40 40 cmpeq t1,0x03,t4
2119     01 00 a0 e4 beq t4,20 <skip>
2120     */
2121     *a++ = 0xa5; *a++ = 0x75; *a++ = 0x40; *a++ = 0x40;
2122     *a++ = 0x04; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
2123    
2124     /*
2125     118: 23 17 63 48 sll t2,0x18,t2
2126     11c: 21 f6 20 48 zapnot t0,0x7,t0
2127     120: 01 04 23 44 or t0,t2,t0
2128     */
2129     *a++ = 0x23; *a++ = 0x17; *a++ = 0x63; *a++ = 0x48;
2130     *a++ = 0x21; *a++ = 0xf6; *a++ = 0x20; *a++ = 0x48;
2131     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
2132    
2133     ok_unaligned_load3 = a;
2134     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
2135    
2136    
2137    
2138    
2139    
2140     *a++ = 0xa5; *a++ = 0x55; *a++ = 0x40; *a++ = 0x40;
2141     *a++ = 0x04; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
2142     /*
2143     2:
2144     124: 23 17 62 48 sll t2,0x10,t2
2145     128: 21 76 20 48 zapnot t0,0x3,t0
2146     12c: 01 04 23 44 or t0,t2,t0
2147     */
2148     *a++ = 0x23; *a++ = 0x17; *a++ = 0x62; *a++ = 0x48;
2149     *a++ = 0x21; *a++ = 0x76; *a++ = 0x20; *a++ = 0x48;
2150     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
2151    
2152     ok_unaligned_load2 = a;
2153     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
2154    
2155    
2156    
2157     *a++ = 0xa5; *a++ = 0x35; *a++ = 0x40; *a++ = 0x40;
2158     *a++ = 0x04; *a++ = 0x00; *a++ = 0xa0; *a++ = 0xe4;
2159     /*
2160     1:
2161     130: 23 17 61 48 sll t2,0x8,t2
2162     134: 21 36 20 48 zapnot t0,0x1,t0
2163     138: 01 04 23 44 or t0,t2,t0
2164     */
2165     *a++ = 0x23; *a++ = 0x17; *a++ = 0x61; *a++ = 0x48;
2166     *a++ = 0x21; *a++ = 0x36; *a++ = 0x20; *a++ = 0x48;
2167     *a++ = 0x01; *a++ = 0x04; *a++ = 0x23; *a++ = 0x44;
2168    
2169     ok_unaligned_load1 = a;
2170     *a++ = 0x01; *a++ = 0x00; *a++ = 0xe0; *a++ = 0xc3;
2171    
2172    
2173    
2174     /*
2175     0:
2176     13c: 01 10 60 40 addl t2,0,t0
2177     */
2178     *a++ = 0x01; *a++ = 0x10; *a++ = 0x60; *a++ = 0x40;
2179    
2180    
2181     *ok_unaligned_load3 = ((size_t)a - (size_t)ok_unaligned_load3 - 4) / 4;
2182     *ok_unaligned_load2 = ((size_t)a - (size_t)ok_unaligned_load2 - 4) / 4;
2183     *ok_unaligned_load1 = ((size_t)a - (size_t)ok_unaligned_load1 - 4) / 4;
2184    
2185     /* stl t0,0(t3) */
2186     *a++ = 0x00; *a++ = 0x00; *a++ = 0x24; *a++ = 0xb0;
2187     break;
2188    
2189     default:
2190     ;
2191     }
2192    
2193     *addrp = a;
2194     bintrans_write_pc_inc(addrp);
2195     return 1;
2196     }
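
/*
 * Reference sketch for the swl/swr tables in the comments above, written
 * as plain C for the little-endian case only. The names swl_merge_le()
 * and swr_merge_le() are hypothetical and do not exist elsewhere in this
 * file; the big-endian case, which the generated code above leaves as a
 * TODO, is not covered. "word" stands for the aligned 32-bit word that
 * the emitted ldl reads before the merge.
 */

```c
#include <assert.h>
#include <stdint.h>

/*
 * swl, little-endian: the upper (ofs + 1) bytes of the register replace
 * the low-order (ofs + 1) bytes of the aligned memory word.
 */
uint32_t swl_merge_le(uint32_t word, uint32_t reg, unsigned ofs)
{
    unsigned shift;
    uint32_t mask;

    assert(ofs < 4);
    shift = 8 * (3 - ofs);
    mask = UINT32_C(0xffffffff) >> shift;    /* bytes taken from reg */
    return (word & ~mask) | ((reg >> shift) & mask);
}

/*
 * swr, little-endian: the lower (4 - ofs) bytes of the register replace
 * the high-order (4 - ofs) bytes of the aligned memory word.
 */
uint32_t swr_merge_le(uint32_t word, uint32_t reg, unsigned ofs)
{
    unsigned shift;
    uint32_t keep;

    assert(ofs < 4);
    shift = 8 * ofs;
    keep = ~(UINT32_C(0xffffffff) << shift);  /* bytes kept from memory */
    return (word & keep) | (reg << shift);
}
```

/*
 * With memory bytes 0x12 0x34 0x56 0x78 (the word 0x78563412) and the
 * register 0x89abcdef, each offset reproduces the matching table row.
 */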
2197    
2198    
2199     /*
2200     * bintrans_write_instruction__lui():
2201     */
2202     static int bintrans_write_instruction__lui(unsigned char **addrp,
2203     int rt, int imm)
2204     {
2205     uint32_t *a;
2206    
2207     /*
2208     * dc fe 3f 24 ldah t0,-292
2209     * 1f 04 ff 5f fnop
2210     * 88 08 30 b4 stq t0,2184(a0)
2211     */
2212     if (rt != 0) {
2213     int alpha_rt = map_MIPS_to_Alpha[rt];
2214     if (alpha_rt < 0)
2215     alpha_rt = ALPHA_T0;
2216    
2217     a = (uint32_t *) *addrp;
2218     *a++ = 0x241f0000 | (alpha_rt << 21) | ((uint32_t)imm & 0xffff);
2219     *addrp = (unsigned char *) a;
2220    
2221     if (alpha_rt == ALPHA_T0) {
2222     *a++ = 0x5fff041f; /* fnop */
2223     bintrans_move_Alpha_reg_into_MIPS_reg(addrp, ALPHA_T0, rt);
2224     }
2225     }
2226    
2227     bintrans_write_pc_inc(addrp);
2228    
2229     return 1;
2230     }
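
/*
 * Sketch of why a single ldah can stand in for lui: ldah with the zero
 * register as base yields the sign-extended 16-bit immediate times 65536,
 * which is the value a 64-bit MIPS lui leaves in rt. The helper names
 * encode_ldah_from_zero() and ldah_result() are hypothetical and only
 * mirror the constant 0x241f0000 used above.
 */

```c
#include <stdint.h>

/* Alpha "ldah ra,imm(zero)": opcode 0x09, Rb = 31 (zero), 16-bit disp. */
uint32_t encode_ldah_from_zero(unsigned ra, uint16_t imm)
{
    return UINT32_C(0x241f0000) | ((uint32_t)(ra & 31) << 21) | imm;
}

/* Value that ldah leaves in ra: sign-extended imm multiplied by 65536. */
int64_t ldah_result(uint16_t imm)
{
    int64_t s = imm;

    if (s >= 0x8000)
        s -= 0x10000;    /* sign-extend the 16-bit displacement */
    return s * 65536;
}
```

/*
 * The example bytes "dc fe 3f 24" in the comment above correspond to
 * encode_ldah_from_zero(1, 0xfedc), that is, ldah t0,-292.
 */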
2231    
2232    
2233     /*
2234     * bintrans_write_instruction__mfmthilo():
2235     */
2236     static int bintrans_write_instruction__mfmthilo(unsigned char **addrp,
2237     int rd, int from_flag, int hi_flag)
2238     {
2239     unsigned char *a;
2240     int ofs;
2241    
2242     a = *addrp;
2243    
2244     /*
2245     * 18 09 30 a4 ldq t0,hi(a0) (or lo)
2246     * 18 09 30 b4 stq t0,rd(a0)
2247     *
2248     * (If from_flag is cleared, the move goes the other way; the
2249     * register is then actually rs, not rd.)
2250     */
2251    
2252     if (from_flag) {
2253     if (rd != 0) {
2254     /* mfhi or mflo */
2255     if (hi_flag)
2256     ofs = ((size_t)&dummy_cpu.cd.mips.hi) - (size_t)&dummy_cpu;
2257     else
2258     ofs = ((size_t)&dummy_cpu.cd.mips.lo) - (size_t)&dummy_cpu;
2259     *a++ = (ofs & 255); *a++ = (ofs >> 8); *a++ = 0x30; *a++ = 0xa4;
2260    
2261     bintrans_move_Alpha_reg_into_MIPS_reg(&a, ALPHA_T0, rd);
2262     }
2263     } else {
2264     /* mthi or mtlo */
2265     bintrans_move_MIPS_reg_into_Alpha_reg(&a, rd, ALPHA_T0);
2266    
2267     if (hi_flag)
2268     ofs = ((size_t)&dummy_cpu.cd.mips.hi) - (size_t)&dummy_cpu;
2269     else
2270     ofs = ((size_t)&dummy_cpu.cd.mips.lo) - (size_t)&dummy_cpu;
2271     *a++ = (ofs & 255); *a++ = (ofs >> 8); *a++ = 0x30; *a++ = 0xb4;
2272     }
2273    
2274     *addrp = a;
2275     bintrans_write_pc_inc(addrp);
2276     return 1;
2277     }
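
/*
 * Sketch of how the ldq/stq words above are put together: the offset of a
 * struct member becomes the 16-bit displacement of an Alpha memory-format
 * instruction with a0 as base register. struct demo_cpu and the helper
 * names are hypothetical stand-ins; the real layout of dummy_cpu is not
 * reproduced here. offsetof() computes the same value as the pointer
 * subtraction used in this file.
 */

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real cpu struct; the layout is made up. */
struct demo_cpu {
    uint64_t gpr[32];
    uint64_t hi;
    uint64_t lo;
};

/* Alpha memory format: opcode(6) | Ra(5) | Rb(5) | disp(16). */
uint32_t alpha_mem(unsigned opcode, unsigned ra, unsigned rb, unsigned disp)
{
    return ((uint32_t)(opcode & 63) << 26) | ((uint32_t)(ra & 31) << 21) |
        ((uint32_t)(rb & 31) << 16) | (disp & 0xffff);
}

/* ldq t0,hi(a0) for the stand-in struct: opcode 0x29, t0 = r1, a0 = r16. */
uint32_t demo_load_hi(void)
{
    return alpha_mem(0x29, 1, 16, (unsigned)offsetof(struct demo_cpu, hi));
}
```

/*
 * The hardware sign-extends the displacement, so this direct form only
 * reaches members whose offset is below 0x8000.
 */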
2278    
2279    
2280     /*
2281     * bintrans_write_instruction__mfc_mtc():
2282     */
2283     static int bintrans_write_instruction__mfc_mtc(struct memory *mem,
2284     unsigned char **addrp, int coproc_nr, int flag64bit, int rt,
2285     int rd, int mtcflag)
2286     {
2287     uint32_t *a, *jump;
2288     int ofs;
2289    
2290     /*
2291     * NOTE: Only a few registers are readable without side effects.
2292     */
2293     if (rt == 0 && !mtcflag)
2294     return 0;
2295    
2296     if (coproc_nr >= 1)
2297     return 0;
2298    
2299     if (rd == COP0_RANDOM || rd == COP0_COUNT)
2300     return 0;
2301    
2302    
2303     /*************************************************************
2304     *
2305     * TODO: Check for kernel mode, or Coproc X usability bit!
2306     *
2307     *************************************************************/
2308    
2309     a = (uint32_t *) *addrp;
2310    
2311     ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
2312     *a++ = 0xa4300000 | (ofs & 0xffff); /* ldq t0,coproc[0](a0) */
2313    
2314     ofs = ((size_t)&dummy_coproc.reg[rd]) - (size_t)&dummy_coproc;
2315     *a++ = 0xa4410000 | (ofs & 0xffff); /* ldq t1,reg_rd(t0) */
2316    
2317     if (mtcflag) {
2318     /* mtc: */
2319     *addrp = (unsigned char *) a;
2320     bintrans_move_MIPS_reg_into_Alpha_reg(addrp, rt, ALPHA_T0);
2321     a = (uint32_t *) *addrp;
2322    
2323     if (!flag64bit) {
2324     *a++ = 0x40201001; /* addl t0,0,t0 */
2325     *a++ = 0x40401002; /* addl t1,0,t1 */
2326     }
2327    
2328     /*
2329     * In the general case: Only allow mtc if it does NOT
2330     * change the register!!
2331     */
2332    
2333     switch (rd) {
2334     case COP0_INDEX:
2335     break;
2336    
2337     case COP0_EPC:
2338     break;
2339    
2340     /* TODO: Some bits are not writable */
2341     case COP0_ENTRYLO0:
2342     case COP0_ENTRYLO1:
2343     break;
2344    
2345     case COP0_ENTRYHI:
2346     /*
2347     * Entryhi is ok to write to, as long as the
2348     * ASID isn't changed. (That would require
2349     * cache invalidations etc.) Instead of checking
2350     * for MMU3K vs others, we just assume that all the
2351     * lowest 12 bits must be the same.
2352     */
2353     /* ff 0f bf 20 lda t4,0x0fff */
2354     /* 03 00 25 44 and t0,t4,t2 */
2355     /* 04 00 45 44 and t1,t4,t3 */
2356     /* a3 05 64 40 cmpeq t2,t3,t2 */
2357     /* 01 00 60 f4 bne t2,<ok> */
2358     *a++ = 0x20bf0fff;
2359     *a++ = 0x44250003;
2360     *a++ = 0x44450004;
2361     *a++ = 0x406405a3;
2362     jump = a;
2363     *a++ = 0; /* later */
2364     *addrp = (unsigned char *) a;
2365     bintrans_write_chunkreturn_fail(addrp);
2366     a = (uint32_t *) *addrp;
2367     *jump = 0xf4600000 | (((size_t)a - (size_t)jump - 4) / 4);
2368     break;
2369    
2370     case COP0_STATUS:
2371     /* Only allow updates to the status register if
2372     the interrupt enable bits were changed, but no
2373     other bits! */
2374     if (mem->bintrans_32bit_only) {
2375     /* R3000 etc. */
2376     /* t4 = 0x0fe70000; */
2377     *a++ = 0x20bf0000;
2378     *a++ = 0x24a50fe7;
2379     } else {
2380     /* 00 00 bf 20 lda t4,0 */
2381     /* ff ff a5 24 ldah t4,-1(t4) */
2382     *a++ = 0x20bf0000;
2383     *a++ = 0x24a5ffff;
2384     }
2385    
2386     /* 03 00 25 44 and t0,t4,t2 */
2387     /* 04 00 45 44 and t1,t4,t3 */
2388     /* a3 05 64 40 cmpeq t2,t3,t2 */
2389     /* 01 00 60 f4 bne t2,<ok> */
2390     *a++ = 0x44250003;
2391     *a++ = 0x44450004;
2392     *a++ = 0x406405a3;
2393     jump = a;
2394     *a++ = 0; /* later */
2395     *addrp = (unsigned char *) a;
2396     bintrans_write_chunkreturn_fail(addrp);
2397     a = (uint32_t *) *addrp;
2398     *jump = 0xf4600000 | (((size_t)a - (size_t)jump - 4) / 4);
2399    
2400     /* If enabling interrupt bits would cause an
2401     exception, then don't do it: */
2402     ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
2403     *a++ = 0xa4900000 | (ofs & 0xffff); /* ldq t3,coproc[0](a0) */
2404     ofs = ((size_t)&dummy_coproc.reg[COP0_CAUSE]) - (size_t)&dummy_coproc;
2405     *a++ = 0xa4a40000 | (ofs & 0xffff); /* ldq t4,reg_rd(t3) */
2406    
2407     /* 02 00 a1 44 and t4,t0,t1 */
2408     /* 83 16 41 48 srl t1,0x8,t2 */
2409     /* 04 f0 7f 44 and t2,0xff,t3 */
2410     *a++ = 0x44a10002;
2411     *a++ = 0x48411683;
2412     *a++ = 0x447ff004;
2413     /* 01 00 80 e4 beq t3,<ok> */
2414     jump = a;
2415     *a++ = 0; /* later */
2416     *addrp = (unsigned char *) a;
2417     bintrans_write_chunkreturn_fail(addrp);
2418     a = (uint32_t *) *addrp;
2419     *jump = 0xe4800000 | (((size_t)a - (size_t)jump - 4) / 4);
2420     break;
2421    
2422     default:
2423     /* a3 05 22 40 cmpeq t0,t1,t2 */
2424     /* 01 00 60 f4 bne t2,<ok> */
2425     *a++ = 0x402205a3;
2426     jump = a;
2427     *a++ = 0; /* later */
2428     *addrp = (unsigned char *) a;
2429     bintrans_write_chunkreturn_fail(addrp);
2430     a = (uint32_t *) *addrp;
2431     *jump = 0xf4600000 | (((size_t)a - (size_t)jump - 4) / 4);
2432     }
2433    
2434     *a++ = 0x40201402; /* addq t0,0,t1 */
2435    
2436     ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
2437     *a++ = 0xa4300000 | (ofs & 0xffff); /* ldq t0,coproc[0](a0) */
2438     ofs = ((size_t)&dummy_coproc.reg[rd]) - (size_t)&dummy_coproc;
2439     *a++ = 0xb4410000 | (ofs & 0xffff); /* stq t1,reg_rd(t0) */
2440     } else {
2441     /* mfc: */
2442     if (!flag64bit) {
2443     *a++ = 0x40401002; /* addl t1,0,t1 */
2444     }
2445    
2446     *addrp = (unsigned char *) a;
2447     bintrans_move_Alpha_reg_into_MIPS_reg(addrp, ALPHA_T1, rt);
2448     a = (uint32_t *) *addrp;
2449     }
2450    
2451     *addrp = (unsigned char *) a;
2452    
2453     bintrans_write_pc_inc(addrp);
2454     return 1;
2455     }
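
/*
 * Sketch of the "reserve a slot, patch it later" pattern used for the
 * bne/beq words above. An Alpha branch holds a 21-bit displacement that
 * is counted in instructions from the instruction after the branch, which
 * is where the "- 4) / 4" in the byte-based expressions comes from. The
 * helper names are hypothetical.
 */

```c
#include <stdint.h>

/* Alpha branch format: opcode(6) | Ra(5) | disp(21). */
uint32_t alpha_branch(unsigned opcode, unsigned ra, int32_t disp)
{
    return ((uint32_t)(opcode & 63) << 26) | ((uint32_t)(ra & 31) << 21) |
        ((uint32_t)disp & 0x1fffff);
}

/* Displacement from a branch slot to its target, in instructions. */
int32_t alpha_branch_disp(const uint32_t *branch, const uint32_t *target)
{
    return (int32_t)(target - branch) - 1;
}

/* Fill in a previously reserved slot with "bne ra,target" (opcode 0x3d). */
void patch_bne(uint32_t *slot, unsigned ra, const uint32_t *target)
{
    *slot = alpha_branch(0x3d, ra, alpha_branch_disp(slot, target));
}
```

/*
 * A bne on t2 (r3) that skips two instructions encodes as 0xf4600002,
 * matching the 0xf4600000 base constant used above.
 */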
2456    
2457    
2458     /*
2459     * bintrans_write_instruction__tlb_rfe_etc():
2460     */
2461     static int bintrans_write_instruction__tlb_rfe_etc(unsigned char **addrp,
2462     int itype)
2463     {
2464     uint32_t *a;
2465     int ofs = 0;
2466    
2467     switch (itype) {
2468     case CALL_TLBWI:
2469     case CALL_TLBWR:
2470     case CALL_TLBP:
2471     case CALL_TLBR:
2472     case CALL_RFE:
2473     case CALL_ERET:
2474     case CALL_BREAK:
2475     case CALL_SYSCALL:
2476     break;
2477     default:
2478     return 0;
2479     }
2480    
2481     a = (uint32_t *) *addrp;
2482    
2483     /* a0 = pointer to the cpu struct */
2484    
2485     switch (itype) {
2486     case CALL_TLBWI:
2487     case CALL_TLBWR:
2488     /* a1 = 0 for indexed, 1 for random */
2489     *a++ = 0x223f0000 | (itype == CALL_TLBWR);
2490     break;
2491     case CALL_TLBP:
2492     case CALL_TLBR:
2493     /* a1 = 0 for probe, 1 for read */
2494     *a++ = 0x223f0000 | (itype == CALL_TLBR);
2495     break;
2496     case CALL_BREAK:
2497     case CALL_SYSCALL:
2498     *a++ = 0x223f0000 | (itype == CALL_BREAK? EXCEPTION_BP : EXCEPTION_SYS);
2499     break;
2500     }
2501    
2502     /* Put PC into the cpu struct (both pc and pc_last). */
2503     *a++ = 0xb4d00000 | ofs_pc; /* stq t5,"pc"(a0) */
2504     *a++ = 0xb4d00000 | ofs_pc_last;/* stq t5,"pc_last"(a0) */
2505    
2506     /* Save the old return address, a0 and the registers caching emulated state (t6-t11) on the stack: */
2507     *a++ = 0x23deff80; /* lda sp,-128(sp) */
2508    
2509     *a++ = 0xb75e0000; /* stq ra,0(sp) */
2510     *a++ = 0xb61e0008; /* stq a0,8(sp) */
2511     *a++ = 0xb0fe0018; /* stl t6,24(sp) */
2512     *a++ = 0xb71e0020; /* stq t10,32(sp) */
2513     *a++ = 0xb73e0028; /* stq t11,40(sp) */
2514     *a++ = 0xb51e0030; /* stq t7,48(sp) */
2515     *a++ = 0xb6de0038; /* stq t8,56(sp) */
2516     *a++ = 0xb6fe0040; /* stq t9,64(sp) */
2517    
2518     switch (itype) {
2519     case CALL_TLBP:
2520     case CALL_TLBR:
2521     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_tlbpr) - (size_t)&dummy_cpu;
2522     break;
2523     case CALL_TLBWR:
2524     case CALL_TLBWI:
2525     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_tlbwri) - (size_t)&dummy_cpu;
2526     break;
2527     case CALL_RFE:
2528     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_rfe) - (size_t)&dummy_cpu;
2529     break;
2530     case CALL_ERET:
2531     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_eret) - (size_t)&dummy_cpu;
2532     break;
2533     case CALL_BREAK:
2534     case CALL_SYSCALL:
2535     ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_simple_exception) - (size_t)&dummy_cpu;
2536     break;
2537     }
2538    
2539     *a++ = 0xa7700000 | ofs; /* ldq t12,ofs(a0) */
2540    
2541     /* Call the helper function selected above: */
2542     *a++ = 0x6b5b4000; /* jsr ra,(t12),<after> */
2543    
2544     /* Restore the old return address, a0 and the saved registers (t6-t11) from the stack: */
2545     *a++ = 0xa75e0000; /* ldq ra,0(sp) */
2546     *a++ = 0xa61e0008; /* ldq a0,8(sp) */
2547     *a++ = 0xa0fe0018; /* ldl t6,24(sp) */
2548     *a++ = 0xa71e0020; /* ldq t10,32(sp) */
2549     *a++ = 0xa73e0028; /* ldq t11,40(sp) */
2550     *a++ = 0xa51e0030; /* ldq t7,48(sp) */
2551     *a++ = 0xa6de0038; /* ldq t8,56(sp) */
2552     *a++ = 0xa6fe0040; /* ldq t9,64(sp) */
2553    
2554     *a++ = 0x23de0080; /* lda sp,128(sp) */
2555    
2556     /* Load PC from the cpu struct. */
2557     *a++ = 0xa4d00000 | ofs_pc; /* ldq t5,"pc"(a0) */
2558    
2559     *addrp = (unsigned char *) a;
2560    
2561     switch (itype) {
2562     case CALL_ERET:
2563     case CALL_BREAK:
2564     case CALL_SYSCALL:
2565     break;
2566     default:
2567     bintrans_write_pc_inc(addrp);
2568     }
2569    
2570     return 1;
2571     }
2572    
2573    
2574     /*
2575     * bintrans_backend_init():
2576     *
2577     * This is necessary for broken 2.95.4 compilers on FreeBSD/Alpha 4.9,
2578     * and probably a few others. (For Compaq's CC, and for gcc 3.x, this
2579     * wouldn't be necessary, and the old code would have worked.)
2580     */
2581     static void bintrans_backend_init(void)
2582     {
2583     int size;
2584     uint32_t *p;
2585    
2586    
2587     /* "runchunk": */
2588     size = 256; /* NOTE: This MUST be enough, or we fail */
2589     p = (uint32_t *)mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_EXEC,
2590     MAP_ANON | MAP_PRIVATE, -1, 0);
2591    
2592     /* If mmap() failed, try malloc(): */
2593     if ((void *)p == MAP_FAILED) {
2594     p = malloc(size);
2595     if (p == NULL) {
2596     fprintf(stderr, "bintrans_backend_init(): out of memory\n");
2597     exit(1);
2598     }
2599     }
2600    
2601     bintrans_runchunk = (void *)p;
2602    
2603     *p++ = 0x23deffa0; /* lda sp,-0x60(sp) */
2604     *p++ = 0xb75e0000; /* stq ra,0(sp) */
2605     *p++ = 0xb53e0008; /* stq s0,8(sp) */
2606     *p++ = 0xb55e0010; /* stq s1,16(sp) */
2607     *p++ = 0xb57e0018; /* stq s2,24(sp) */
2608     *p++ = 0xb59e0020; /* stq s3,32(sp) */
2609     *p++ = 0xb5be0028; /* stq s4,40(sp) */
2610     *p++ = 0xb5de0030; /* stq s5,48(sp) */
2611     *p++ = 0xb5fe0038; /* stq s6,56(sp) */
2612     *p++ = 0xb7be0058; /* stq gp,0x58(sp) */
2613    
2614     *p++ = 0xa4d00000 | ofs_pc; /* ldq t5,"pc"(a0) */
2615     *p++ = 0xa0f00000 | ofs_n; /* ldl t6,"bintrans_instructions_executed"(a0) */
2616     *p++ = 0xa5100000 | ofs_a0; /* ldq t7,"a0"(a0) */
2617     *p++ = 0xa6d00000 | ofs_a1; /* ldq t8,"a1"(a0) */
2618     *p++ = 0xa6f00000 | ofs_s0; /* ldq t9,"s0"(a0) */
2619     *p++ = 0xa1300000 | ofs_ds; /* ldl s0,"delay_slot"(a0) */
2620     *p++ = 0xa5500000 | ofs_ja; /* ldq s1,"delay_jmpaddr"(a0) */
2621     *p++ = 0xa5700000 | ofs_sp; /* ldq s2,"gpr[sp]"(a0) */
2622     *p++ = 0xa5900000 | ofs_ra; /* ldq s3,"gpr[ra]"(a0) */
2623     *p++ = 0xa5b00000 | ofs_t0; /* ldq s4,"gpr[t0]"(a0) */
2624     *p++ = 0xa5d00000 | ofs_t1; /* ldq s5,"gpr[t1]"(a0) */
2625     *p++ = 0xa5f00000 | ofs_t2; /* ldq s6,"gpr[t2]"(a0) */
2626     *p++ = 0xa7100000 | ofs_tbl0; /* ldq t10,table0(a0) */
2627     *p++ = 0xa7300000 | ofs_v0; /* ldq t11,"gpr[v0]"(a0) */
2628    
2629     *p++ = 0x6b514000; /* jsr ra,(a1),<back> */
2630    
2631     *p++ = 0xb4d00000 | ofs_pc; /* stq t5,"pc"(a0) */
2632     *p++ = 0xb0f00000 | ofs_n; /* stl t6,"bintrans_instructions_executed"(a0) */
2633     *p++ = 0xb5100000 | ofs_a0; /* stq t7,"a0"(a0) */
2634     *p++ = 0xb6d00000 | ofs_a1; /* stq t8,"a1"(a0) */
2635     *p++ = 0xb6f00000 | ofs_s0; /* stq t9,"s0"(a0) */
2636     *p++ = 0xb1300000 | ofs_ds; /* stl s0,"delay_slot"(a0) */
2637     *p++ = 0xb5500000 | ofs_ja; /* stq s1,"delay_jmpaddr"(a0) */
2638     *p++ = 0xb5700000 | ofs_sp; /* stq s2,"gpr[sp]"(a0) */
2639     *p++ = 0xb5900000 | ofs_ra; /* stq s3,"gpr[ra]"(a0) */
2640     *p++ = 0xb5b00000 | ofs_t0; /* stq s4,"gpr[t0]"(a0) */
2641     *p++ = 0xb5d00000 | ofs_t1; /* stq s5,"gpr[t1]"(a0) */
2642     *p++ = 0xb5f00000 | ofs_t2; /* stq s6,"gpr[t2]"(a0) */
2643     *p++ = 0xb7300000 | ofs_v0; /* stq t11,"gpr[v0]"(a0) */
2644    
2645     *p++ = 0xa75e0000; /* ldq ra,0(sp) */
2646     *p++ = 0xa53e0008; /* ldq s0,8(sp) */
2647     *p++ = 0xa55e0010; /* ldq s1,16(sp) */
2648     *p++ = 0xa57e0018; /* ldq s2,24(sp) */
2649     *p++ = 0xa59e0020; /* ldq s3,32(sp) */
2650     *p++ = 0xa5be0028; /* ldq s4,40(sp) */
2651     *p++ = 0xa5de0030; /* ldq s5,48(sp) */
2652     *p++ = 0xa5fe0038; /* ldq s6,56(sp) */
2653     *p++ = 0xa7be0058; /* ldq gp,0x58(sp) */
2654     *p++ = 0x23de0060; /* lda sp,0x60(sp) */
2655     *p++ = 0x6bfa8001; /* ret */
2656    
2657    
2658     /* "jump to 32bit pc": */
2659     size = 128; /* WARNING! Don't make this too small. */
2660     p = (uint32_t *)mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_EXEC,
2661     MAP_ANON | MAP_PRIVATE, -1, 0);
2662    
2663     /* If mmap() failed, try malloc(): */
2664     if ((void *)p == MAP_FAILED) {
2665     p = malloc(size);
2666     if (p == NULL) {
2667     fprintf(stderr, "bintrans_backend_init(): out of memory\n");
2668     exit(1);
2669     }
2670     }
2671    
2672     bintrans_jump_to_32bit_pc = (void *)p;
2673    
2674     /* Don't execute too many instructions: */
2675     *p++ = 0x205f0000 | (N_SAFE_BINTRANS_LIMIT-1); /* lda t1,safe-1 */
2676    
2677     *p++ = 0x40e20da1; /* cmple t6,t1,t0 */
2678     *p++ = 0xf4200001; /* bne */
2679     *p++ = 0x6bfa8001; /* ret */
2680    
2681     *p++ = 0x40c01411; /* addq t5,0,a1 */
2682    
2683     /*
2684     * Special case for 32-bit addressing:
2685     *
2686     * t1 = 1023;
2687     * t2 = ((a1 >> 22) & t1) * sizeof(void *);
2688     * t3 = ((a1 >> 12) & t1) * sizeof(void *);
2689     * t1 = a1 & 4095;
2690     */
2691     *p++ = 0x205f1ff8; /* lda t1,1023 * 8 */
2692     *p++ = 0x4a227683; /* srl a1,19,t2 */
2693     *p++ = 0x4a213684; /* srl a1, 9,t3 */
2694     *p++ = 0x44620003; /* and t2,t1,t2 */
2695    
2696     /*
2697     * t10 is vaddr_to_hostaddr_table0
2698     *
2699     * a3 = tbl0[t2] (load entry from tbl0)
2700     */
2701     *p++ = 0x43030412; /* addq t10,t2,a2 */
2702     *p++ = 0x44820004; /* and t3,t1,t3 */
2703     *p++ = 0xa6720000; /* ldq a3,0(a2) */
2704     *p++ = 0x205f0ffc; /* lda t1,0xffc */
2705    
2706     /*
2707     * a3 = tbl1[t3] (load entry from tbl1 (which is a3))
2708     */
2709     *p++ = 0x42640413; /* addq a3,t3,a3 */
2710     *p++ = 0x46220002; /* and a1,t1,t1 */
2711    
2712     *p++ = 0xa6730000 | ofs_c0; /* ldq a3,chunks[0](a3) */
2713    
2714     /*
2715     * NULL? Then just return.
2716     */
2717     *p++ = 0xf6600001; /* bne a3,<ok> */
2718     *p++ = 0x6bfa8001; /* ret */
2719    
2720     *p++ = 0x40530402; /* addq t1,a3,t1 */
2721     *p++ = 0xa0220000; /* ldl t0,0(t1) */
2722    
2723     /* No translation? Then return. */
2724     *p++ = 0xe4200003; /* beq t0,<skip> */
2725    
2726     *p++ = 0xa4700000 | ofs_cb; /* ldq t2,chunk_base_address(a0) */
2727    
2728     *p++ = 0x40230401; /* addq t0,t2,t0 */
2729     *p++ = 0x6be10000; /* jmp (t0) */
2730    
2731     /* Return to the main translation loop. */
2732     *p++ = 0x6bfa8001; /* ret */
2733     }
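
/*
 * Sketch of the index arithmetic in the "jump to 32bit pc" code above.
 * Shifting by 19 and 9 instead of 22 and 12, and masking with 1023 * 8,
 * yields the table index already multiplied by the pointer size, so no
 * separate multiply is needed. The function names are hypothetical, and
 * PTR_SIZE assumes 8-byte host pointers, as on Alpha.
 */

```c
#include <stdint.h>

#define TABLE_ENTRIES   1024    /* 10 index bits per level */
#define PTR_SIZE        8       /* sizeof(void *) on Alpha */
#define INDEX_MASK      ((TABLE_ENTRIES - 1) * PTR_SIZE)    /* 0x1ff8 */

/* Byte offset into the first-level table: ((v >> 22) & 1023) * 8. */
uint32_t level0_byte_offset(uint32_t vaddr)
{
    return (vaddr >> 19) & INDEX_MASK;
}

/* Byte offset into the second-level table: ((v >> 12) & 1023) * 8. */
uint32_t level1_byte_offset(uint32_t vaddr)
{
    return (vaddr >> 9) & INDEX_MASK;
}

/* Word-aligned byte offset of the instruction within its 4 KB page. */
uint32_t page_byte_offset(uint32_t vaddr)
{
    return vaddr & 0xffc;
}
```

/*
 * For the address 0x80123456 the first-level index is 512, the
 * second-level index is 291, and the offset within the page is 0x454.
 */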
2734    
