Contents of /trunk/src/bintrans_i386.c


Revision 2 - (show annotations)
Mon Oct 8 16:17:48 2007 UTC (16 years, 5 months ago) by dpavlin
File MIME type: text/plain
File size: 83884 byte(s)
++ trunk/HISTORY	(local)
$Id: HISTORY,v 1.676 2005/04/07 15:14:55 debug Exp $

Changelog for GXemul:
---------------------

20030829	Skeleton. ELF stuff. Some instructions.
20030830	Simple test programs using +-*/^|&%, function calls,
		loops, and stuff like that work.
20030903	Adding more instructions, fixing some bugs.
20030907	Making adding of memory mapped devices easier, although
		the framework isn't built for speed.
		Adding a -q switch to turn off debug output.
20030911	Trying to fix some bugs. Minor changes. Some COP0
		registers are now meaningful.
20030919	Making APs (non-bootstrap cpus) available via a simple
		'mp' device. Implementing ll/lld and sc/scd (for atomic
		memory updates, needed by MP operating systems).
20030923	Minor updates: more instructions (divu, mulu, lwu,
		perhaps some more), and opcode usage statistics.
20030924	If the next instruction is nullified (for 'branch
		likely' type of instructions), counters for delays etc
		are now decreased as they should.
		Adding some comments.
		Adding instructions: movz, movn.
		Adding a simple mandelbrot test to mipstest.c.
20030925	Adding instructions: bltzl, bgezl, lh, lhu, sh, mfc*,
		mtc*.
		Adding dummy instructions: sync, cache.
		Adding minimal DECstation PROM functionality: printf()
		and getsysid() callback functions.
		Beginning work on address translation.
20030927	Adding some more cop0 functionality (tlb stuff).
		Adding mc146818 real-time clock. (Skeleton stuff.)
20030928	Adding a dc7085 serial console device (dummy, but enough
		to output chars to the screen). NetBSD uses this for
		the MIPSMATE 5100.
20030929	Working on the TLB stuff.
		Adding instructions: srlv, tlbwr, tlbr, tlbp, eret.
20030930	Trying to find a bug which causes NetBSD to bug out, but
		it is really hard.
		Adding some a.out support (for loading an old
		OpenBSD 2.8/pmax kernel image).
		Adding instructions: lwc*, ldc*, swc1 and swc3.
		Beginning to add special code to handle the differences
		between R4000 (the default emulation) and R2000/R3000.
20031001	Symbol listings produced by 'nm -S' can be used to
		show symbolic names for addresses. (-S)
20031002	Fixing the i/d fake cache for R2000/R3000. It's still
		just an ugly hack, though.
		Fixing minor bugs to make the 3100 emulation use the
		dc device (serial console) correctly. So far, 5100 and
		3100 are the only ones that get far enough to print
		stuff, when booting NetBSD.
20031004	Adding skeleton Cobalt machine emulation (-E).
		Adding a dummy ns16550 serial controller, used by the
		Cobalt machine emulation.
20031006	Adding unaligned load/store instructions (lwl, lwr,
		ldl, ldr, swl, swr, sdl, sdr), although they are not
		tested yet.
		Fixed a "data modified on freelist" bug when running
		NetBSD/cobalt: setting the top bit of the index register
		when a tlbp fails (as the R4000 manual says) isn't
		sufficient, I had to clear the low bits as well.
		Adding break and syscall instructions, but they are not
		tested yet.
		Adding a 'gt' device, faking a PCI bus, for the Cobalt
		emulation.
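
A minimal sketch of the tlbp fix described in the 20031006 entry above, assuming a 32-bit Index register whose top bit signals a failed probe. The function name and the "negative means no match" convention are hypothetical, not taken from the emulator's source.

```c
#include <stdint.h>

/* Probe-failure flag: the top bit of the 32-bit Index register. */
#define INDEX_P 0x80000000u

/*
 * Hypothetical helper: compute the Index register value after a tlbp.
 * match_index < 0 means that no TLB entry matched the probe.
 */
static uint32_t tlbp_index_result(int match_index)
{
    if (match_index < 0) {
        /* Assign the whole register, so the top bit is set AND the
         * low bits are cleared, instead of OR-ing the bit into
         * whatever value was there before. */
        return INDEX_P;
    }
    return (uint32_t)match_index;
}
```

Assigning the full value rather than OR-ing in the flag is what clears the low bits on a failed probe.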
20031008	Adding initial support for HPCmips (-F), a framebuffer
		device using X11. NetBSD/hpcmips can output pixels to
		the framebuffer, but that's about it.
20031009	Fixing the NetBSD/pmax bug: the "0/tftp/netbsd" style
		bootstring was only passed correctly in the bootinfo
		block, it needs to be passed as argv[0] as well.
		Adding instructions: mtlo, mthi.
		Rearranging the source tree layout.
		Adding console input functionality. The NetBSD/cobalt
		kernel's ddb can now be interacted with.
20031010	Adding experimental (semi-useless) -t option, to show
		a function call tree while a program runs.
		Linux/cobalt now prints a few messages, but then hangs
		at "Calibrating delay loop..." unless an ugly hack is
		used (setting a word of memory at 0x801e472c to non-zero).
20031013	Adding a framebuffer device used in DECstation 3100;
		VFB01 for mono is implemented so far, not yet the
		VFB02 (color) variant.  Rewriting the framebuffer
		device so that it is usable by both HPCmips and DECstation
		emulation.
20031014	Minor fixes. Everything should compile and run ok
		both with and without X11.
20031015	Adding support for ECOFF binary images; text, data,
		and symbols are loaded. (Playing around with ultrixboot
		and ultrix kernels.)
20031016	The DECstation argv,argc stuff must be at 0xa0000000,
		not 0x80000000, or Ultrix kernels complain.
		Adding R2000/R3000 'rfe' instruction.
		Implementing more R2K/R3K tlb specific stuff, so that
		NetBSD boots and uses the tlb correctly, but much of
		it is ugly. (Needs to be separated in a cleaner way.)
		ECOFF symbols sizes are now calculated, so that offsets
		within symbols are usable.
20031017	DECstation bootstrings now automatically include the
		correct name of the kernel that is booting.
		Ultrix boots a bit.
20031018	ELF symbols are now read automatically from the binary.
		-t trace looks a bit better (string arguments are shown).
		Trying to get initial R5900 stuff working (the 128-bit
		CPU used in Playstation 2).
		Fixing a minor bug to make the VFB02 (color framebuffer)
		device work better, but it is still just 256 grayscales,
		not real color. Ultrix can now use the framebuffer (it
		calls it PMAX-CFB).
		A machine can now consist of CPUs of different types.
		Adding instructions: daddi, mov_xxx, mult_xx. The xxx
		instructions are not documented MIPS64 instructions,
		but NetBSD/playstation2 uses them. Perhaps VR5432
		instructions?
		Adding sign-extension to 32-bit mult.
		Adding Playstation 2 devices: dmac (DMA controller),
		gs (Graphic something?), and gif (graphics something
		else, which has access to the PS2's framebuffer).
		NetBSD/playstation2 works a bit, and prints a few
		bootup messages.
20031020	The cpu_type field of the cpu struct now contains
		usable values in a much better form than before. This
		simplifies adding of new CPU types.
20031021	Fixing an interrupt related bug: pc_last was used, but
		for interrupts this was incorrect. Fixed now.
		Fixing a load/store related bug: if a load into a
		register was aborted due to an exception, the register
		was still modified.
		The mc146818 rtc now reads its time from the system's
		time() function.
		Fixing another exception bug: if loading an instruction
		caused an exception, something bogus happened as the
		emulator tried to execute the instruction anyway. This
		has been fixed now.
20031023	Adding a quick hack which skips "while (reg --) ;"
		kind of loops.
		NetBSD/pmax suddenly reached userland (!), but only
		once and attempts to repeat it have failed. I believe
		it is problems with my interrupt handling system.
20031024	Adding 8-bit color palette support to the framebuffer.
		Connecting the pmax vdac device to the framebuffer's
		rgb palette.
		Fixing a bug in the dc device, so that console input
		is possible; interaction with NetBSD/pmax's built-in
		kernel debugger works now.
		Symbol sizes for file formats where symbol size isn't
		included are now calculated regardless of file format.
		Physical memory space can now be smaller than 64 bits,
		improving emulation speed a bit.
		Doing other minor performance enhancements by moving
		around some statements in critical parts of the code.
20031025	Minor changes to the dc device.
20031026	Adding support for reading symbols directly from
		a.out files. (Works with OpenBSD/pmax binaries.)
		Hardware devices may now register "tick functions" at
		specific cycle intervals in a generic fashion.
		All four channels of the dc serial controller device
		should now work; playing around with keyboard scan
		code generation when using the DECstation framebuffer.
		Making various (speed) improvements to the framebuffer
		device.
20031027	Playing around with the sii SCSI controller.
20031028	Minor fixes.
		Adding an SGI emulation mode (-G), and some ARCBIOS
		stuff, which SGIs seem to use.
		Adding getbitmap() to the DEC prom emulation layer,
		so some more -D x models become more usable.
		Adding a dummy 'ssc' serial console device for
		DECsystem 5400 emulation.
		Playing around with TURBOchannel stuff.
20031030	Minor fixes.
		Adding the sub instruction. (Not tested yet?)
		Sign-extending the results of multu, addi,addiu,
		add,addu,sub,subu,mfcZ.
		Adding a colorplanemask device for DECstation 3100.
		Fixed the NetBSD/pmax bug: I had forgotten to reset
		asid_match to 0 between tlb entry checks. :-)  Now
		userland runs nicely...
20031031	Fixing more bugs:  unaligned load/store could fail
		because of an exception, but registers could be "half
		updated". This has been fixed now.  (As a result,
		NetBSD/pmax can now run with any of r2000,r3000,r4000,
		r4400, or r5000.)
		Adding some R5K and R10000 stuff.  (Note: R5K is NOT
		R5000. Weird.)
		Adding dummy serial console (scc) for MAXINE.
		MAXINE also works with framebuffer, but there is no
		color palette yet (only black and white output).
20031101	Moving code chunks around to increase performance by
		a few percent.
		The opcode statistics option (-s) now shows opcode
		names, and not just numbers. :-)
		Fixing the bug which caused NetBSD/pmax to refuse
		input in serial console mode, but not in keyboard/
		framebuffer mode: the osconsole environment variable
		wasn't set correctly.
		Adding DEC PROM getchar() call.
		The transmitter scanner of the dc device now scans
		all four channels at once, for each tick, so serial
		output is (approximately) 4 times faster.
20031103	Adding a dummy BT459 vdac device, which does nothing
		but allows a PMAG-BA turbochannel graphics card to be
		used as framebuffer.
		Several DECstation machines (-D 2, 3, and 4) can now
		use TURBOchannel option card framebuffers as console,
		for output. (Keyboard input is still not implemented
		for those models.)  Only PMAG-AA (1280x1024x8) and
		PMAG-BA (1024x864x8), both using BT459 vdac, have
		been tested so far.
		Modifying the X11 routines so that several framebuffer
		windows now can be used simultaneously (if several
		graphics option cards are to be emulated concurrently).
20031104	DEC MIPSMATE 5100 (KN230) interrupts are shared
		between devices. I've added an ugly hack to allow
		that to work, which makes it possible to boot NetBSD
		into userland with serial console.
20031106	Removing the -S (symbol) option, as symbol files can
		now be given in any order together with other file
		names to be loaded.
		cookin tipped me about using (int64_t) (int32_t)
		casts instead of manually sign-extending values.
		Casting sometimes increases performance, sometimes
		decreases. It's tricky.
		Importing mips64emul into CVS.
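
The cast trick mentioned in the 20031106 entry above can be compared with manual sign extension as follows. This is only an illustrative sketch; the function names are made up, and the unsigned-to-signed conversion relies on the two's complement behavior that mainstream compilers provide.

```c
#include <stdint.h>

/* Manual sign extension of a 32-bit value to 64 bits. */
static uint64_t sign_extend_manual(uint32_t x)
{
    uint64_t r = x;
    if (r & 0x80000000ULL)
        r |= 0xffffffff00000000ULL;
    return r;
}

/*
 * The same result using casts: reinterpret as signed 32-bit, then
 * widen to signed 64-bit, which replicates the sign bit.
 * Note: converting an out-of-range uint32_t to int32_t is
 * implementation-defined before C23; common compilers wrap as expected.
 */
static uint64_t sign_extend_cast(uint32_t x)
{
    return (uint64_t)(int64_t)(int32_t)x;
}
```

Which variant is faster depends on the compiler and host, which matches the observation in the entry that casting sometimes helps and sometimes hurts.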
20031107	Adding a generic ARC emulation mode.
		Increasing performance of the framebuffer by not
		updating it (or the XImage) if a write to the
		framebuffer contains exactly what is already in it.
		(This improves scrolling speed and initialization.)
		Adding initial MIPS16 support.
		Adding initial disk image support (-d command line
		option), but this will not be used until I get some
		kind of SCSI-controller emulation working.
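
The framebuffer optimization in the 20031107 entry above (skipping the update when a write contains what is already there) could look roughly like this. A sketch only: fb_write and its return convention are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical framebuffer write helper.
 * Returns 1 if the framebuffer contents changed (the caller should
 * redraw the affected area), or 0 if the write was skipped because
 * the bytes were already identical.
 */
static int fb_write(unsigned char *fb, size_t offset,
                    const unsigned char *data, size_t len)
{
    if (memcmp(fb + offset, data, len) == 0)
        return 0;               /* nothing changed: skip the update */
    memcpy(fb + offset, data, len);
    return 1;
}
```

A caller would only touch the XImage (or schedule a redraw) when the function returns 1.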
20031108	Adding the first MIPS16 instructions: "move y,X",
		"ld y,D(x)", and "daddiu S,K" (but the last one
		doesn't work yet).
		Fixing the console environment variable for
		DECstation machine type 2; both serial console and
		graphical console work now.
		Enough of the 'asc' controller is now implemented
		to let NetBSD get past scsi disk detection when
		no disk images are used.
		Other X-windows bit-depths than 24 bits work now,
		but colors are still not correct in non-24 bit modes.
		Keypresses in X framebuffer windows are now
		translated into console keypresses. (Normal keys, but
		not cursor keys or other special keys.)
20031111	Adding support for X11 using non-24-bit output.
20031120	Adding X11 mouse event to emulated mouse event
		translation, but it's not tested yet.
		Trying to get more of the SCSI controller emulation
		to work.
20031124	Raw binaries can now be loaded into memory.
20031204	Adding srec binary support.
20031220	Adding some super-ugly arcbios emulation code.
		Making some progress on the SGI and ARC machine
		emulations.
20031222	SGI and ARC progress. Multiple CPUs are now added to
		the arcbios component tree (although NetBSD cannot
		actually use more than one).
20031228	Adding 'crime' and 'macepci' fake devices for SGI
		emulation.
		Finally implementing the cop0 'compare' register.
		Improvements to the ns16550 device, but it is still
		incomplete.
		SGI userland is now reached, but interaction is broken
		(due to the buggy ns16550).
20031229	Adding some more instructions: teq, dsllv
		Adding a Nintendo 64 emulation mode (skeleton).
		Adding R4300 and R12000 to the cpu list.
20031230	Adding bltzal, bltzall, bgezal, bgezall (not really
		tested yet).
		Fixing the 16550 serial controller device (by not
		supporting fifo, so in fact it emulates a 16450
		instead).  This causes NetBSD/sgimips to run nicely
		into userland, sysinst, and so on.
		Some ARC/RD94 interrupts seem to work ok now, but
		i/o interrupts are still not correctly implemented.
		NetBSD/arc userland is reached and can be interacted
		with, but there's no sysinst (?).
20040103	Trying to get some Irix stuff to work, but it's hard.
		Fixing some Cobalt/linux problems.
20040104	Adding a dummy 8250 device, so that Linux/sgimips can output
		console messages.
		Adding dmultu. (The same as dmult, so I'm not sure it's correct.
		Perhaps dmultu is correct and dmult is wrong...)
		Fixing a bug in unaligned load/stores of 64-bit values (a cast
		was needed).
		Linux/sgimips in 64-bit works a bit more than before.
		Adding simple (polled) input functionality to dev_zs.
		Making some progress on SGI-IP22 (IP32 still works best,
		though).
		Fixing the mc146818 clock device in ARC/NEC and SGI emulation
		modes, the year field was not correct.
		Adding a fake 'pref' instruction (lwc3).
20040106	Separating out memory.h from misc.h.
		Refactoring of a lot of small code fragments.
		The PCI bus device is now shared between Cobalt, SGI, and ARC.
		Support for RAM mirroring (dev_ram.c, not really tested yet).
		Ugly hack to select the largest of ELF string symbol tables,
		if there are more than one.
		Memory hole fix for ARCBIOS, and a fix for very large (>= 4GB)
		amounts of emulated RAM.
		TGA (DEC 21030) PCI graphics device. NetBSD/arc can boot with
		this card and use it as a framebuffer console.
20040107	Adding a fix (partly incorrect) to daddi, to allow Linux/sgimips
		to boot in 64-bit mode.
20040108	Fixing a sll/nop bug (rd==0 for nop, not sa==0 as before).
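
To illustrate the sll/nop fix in the 20040108 entry above: an sll whose destination register is $0 has no visible effect, whereas an sll with a zero shift amount may still copy a register. The sketch below uses the standard MIPS R-type field layout; the function names are hypothetical.

```c
#include <stdint.h>

/* sll is opcode 0 (SPECIAL) with function field 0. */
static int is_sll(uint32_t iword)
{
    return (iword >> 26) == 0 && (iword & 0x3f) == 0;
}

/* An sll is treated as a nop when rd (bits 15..11) is 0,
 * regardless of the shift amount sa (bits 10..6). */
static int sll_is_nop(uint32_t iword)
{
    unsigned rd = (iword >> 11) & 0x1f;
    return is_sll(iword) && rd == 0;
}
```

For example, the word 0x00031000 encodes "sll $2,$3,0": sa is 0, but the instruction still copies $3 into $2, so testing sa instead of rd would wrongly skip it.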
20040109	Trying to get an SGI-IP32 PROM image to boot.
20040110	Faking R10000 cache things.
		The PROM image boots, although it takes almost forever for it
		to realize that there is no keyboard.
		The 'gbe' SGI-IP32 graphics device works enough to display the
		Linux framebuffer penguin in the upper left corner :-)
20040111	-p and -P addresses can now be given as symbol names, not just
		numeric values.
		Experimenting with adding a PCIIDE (dev_wdc) controller to the
		Cobalt emulation.
20040120	Adding src/bintrans.c. No code yet, but this is a place for
		ideas to be written down.
		Increasing performance a little bit by inlining the check for
		interrupts (which occurs for every instruction).
20040124	Experimenting with pure userland (syscall) emulation.
20040127	Fixes for compiling under Solaris.
20040206	Some bintrans experiments.
20040209	Adding some simple Ultrix userland emulation syscalls.
20040211	Adding decprom_dump_txt_to_bin.c to the experiments/ dir.
		Adding a section to doc/ on how to use DECstation PROM dumps.
		Adding a hello world example to doc/ as well.
20040218	TURBOchannel slots that are empty now return a DBE exception,
		so that Ultrix and DECstation PROMs don't complain about
		broken TURBOchannel ROMs.
		Working some more on the machine-dependent interrupt stuff.
20040219	Trying out some Linux/DECstation kernels (semi-successfully).
20040222	YES! I finally found the bug that caused Linux/SGI-IP32 to only
		work on Alpha, not on 32-bit machines.  It was a shift left,
		probably done using 6 bits on alpha, 5 bits on 32-bit machines.
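
The entry above does not show the offending code, so the following is only a guess at the general pattern. In C, shifting a value by at least its own width is undefined, and hosts differ in how many bits of the shift count they use. One way to get the same result everywhere is to do the shift in an explicitly 64-bit type; shl64 is a made-up name.

```c
#include <stdint.h>

/*
 * Shift left in a 64-bit type, using exactly 6 bits of the shift
 * count on every host. This avoids depending on whether 'long' is
 * 32 or 64 bits wide on the machine running the emulator.
 */
static uint64_t shl64(uint64_t value, unsigned amount)
{
    return value << (amount & 63);
}
```

With this helper, shl64(1, 32) yields 0x100000000 on both 32-bit and 64-bit hosts, whereas an expression such as "1 << 32" with a 32-bit operand has no defined result.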
20040223	Some minimal DEC KN5800 progress; Ultrix prints some boot
		messages, detects 16 XMI R3000 cpus, and gets a NULL panic.
		It's all fake, though, the CPUs don't actually work.
		Still, better than nothing :-)
20040225	An Ultrix OSF1 kernel with a ramdisk now boots :-)  (It was
		a problem with ultrixboot not giving the same arguments as
		NetBSD's boot program.)
20040225(later)	Fixing a bug in the DECstation dc serial device; digits 0-9
		were translated to numeric keypad 0-9, not the normal 0-9.
		(This caused Ultrix to print escape sequences instead of
		digits.)
20040226	Some progress on machine-dependent interrupt delivery
		for -D7 (Maxine) and -D4, and some more 'scc' serial
		controller features are implemented (but no interrupts/
		dma/keyboard/mouse stuff yet).
20040228	Progress on the scc controller; -D4 works in both serial
		console mode and with keyboard (graphical console), but no
		mouse yet.
20040301	SGI mace interrupts are now done using the new
		machine-independent interrupt system.
20040303	Fixing an R5900 bug; the lowest 6 bits have special meaning
		for coprocessor functions, not just 5 bits as on non-R5900
		CPUs. (This fixes a bug which caused NetBSD to crash.)
20040304	Adding enough (fake) DMA capabilities to the ioasic device
		to allow Ultrix to print boot messages in the -D3, -D4,
		and -D7 modes, and also print graphical console messages
		in -D4 and -D7 modes.
		-D11 (DEC5500) polled getchar added (to the 'ssc' device).
		Adding the 'madd' instruction (including R5900 weird stuff).
20040304(later)	Playstation 2's GIF can now copy 640x16 pixel chunks, allowing
		NetBSD to scroll up the framebuffer.  The cursor also works
		better now.
		Playstation 2 bootinfo RTC data should now be passed correctly
		to the running kernel.
		DECstation rtc year should be either 72 or 73, anything else
		will cause Ultrix to give a warning about invalid year.
20040306	Combining playstation2's dmac, interrupt, and timer devices
		into one (ps2_stuff).
		Adding some R5900 instructions: mfsa, mtsa, pmfhi, pmflo, por,
		lq, and sq.  (Most of them are just guesses, though.)
		Implementing my own XImage putpixel routine, which can be
		inlined... significantly faster than normal XPutPixel. :-)
20040307	Implementing the basic functionality of a "PMAG-CA" pixelstamp
		accelerated framebuffer device. Works with NetBSD and
		Ultrix, but no cursor or color support.
20040308	PMAG-CA, -DA, and -FA pixelstamps seem to work now.
		Adding a hack to allow a pmax/mach kernel to be loaded (it's
		a COFF file with 0 (!) sections).
		Initial test of bt459 + framebuffer cursor support.
20040309	Fixes/updates of dev_dec5800 and dev_ssc (and dev_decxmi) allow
		a KN5800 Ultrix-OSF1-ramdisk kernel to boot all the way into
		userland and be interacted with.
		The bt459 cursor should now look semi-nice, but it is still
		a bit fake.
20040310	Moving the DEC CCA stuff from src/machine.c into a separate
		device file (devices/dev_deccca.c).
		An ugly hack added to allow some more OSF/1 kernels (almost
		a.out, but without many of the header fields) to load.
20040314	Adding PMAG-JA and PMAG-RO (1280x1024 x 8-bit) TURBOchannel
		graphics devices. They work in Ultrix, but only monochrome
		and no cursor, because there are no ramdacs or such yet.
20040315	Pixelstamp solid fill now supports colors other than just
		zero-fill.
		Adding a (new) regression test skeleton.
20040321	Some really minor updates.
20040323	Fixes to allow SGI-IP20 and IP22 to work a bit better
		(aliased memory), and adding "private" firmware-like vectors
		to arcbios emul. An IP22 Irix kernel gets far enough to
		print an assertion warning (and then double panics). :-)
20040324	Adding a generalization hack to the SCC serial controller
		to work with SGI-IP19 (in addition to DECstations).
		Adding the 'sdc1' instruction.
		Some progress on various SGI emulation modes.
20040325	Minor updates.
20040326	Fixed a 'madd' bug (r5900). NetBSD/playstation2 now reaches
		userland correctly.  And a simple fix which allows NetBSD
		timer interrupts to be triggered; NetBSD uses T_MODE_CMPE
		(compare), while Linux uses _OVFE (overflow).
20040328	Linux on Playstation 2 boots a bit. The Playstation 2
		graphics controller has been extended to work better with
		NetBSD, and to include some Linux support as well.
		Some interrupt handling enhancements on Playstation 2,
		needed for Linux' dma.
		128-bit loads and stores (lq and sq) are allowed, although
		the top half of quadwords are not modified by other
		instructions. (Linux uses lq and sq.)
		Big-endian X Windows servers now display correct rgb color,
		not bgr as before.
20040330	Some minor updates to the documentation.
20040401	Adding a dummy ps2 OHCI device.
20040402	Progress on the asc SCSI controller.
20040406	Hack to allow ./configure, make to work on HP-UX B.11.00
		on HPPA-RISC, gcc 3.3.2. (Does not work with HP's cc.)
		More progress on the asc SCSI controller. Fixing INQUIRY,
		adding READ_CAPACITY, adding READ. Works a bit with NetBSD
		and some (but not all) Ultrix kernels, on DECstation type 2.
		Adding WRITE, SYNCHRONIZE_CACHE.
		Mounting disks works in NetBSD :-)  It is a bit buggy,
		though. Or something else is buggy.
20040407	The bug is triggered by gunzip during NetBSD/pmax install.
20040408	Fixing a bug (non-nul-terminated string) which caused X11
		cursors to not display on Solaris.
		Unnecessary X11 redraws are skipped (removes some weird
		delays that existed before), and cursors are redrawn on
		window exposure. (The cursor functionality has been moved
		from dev_fb.c to x11.c.)
20040411	Fixing the DC7085 device so that Ultrix doesn't behave weird
		if both tx and rx interrupts occur at the same time.
		More advancements on the asc SCSI controller.
		More disk image filename prefixes are now recognized; c (for
		CD-ROM, as before), d for disk, b for boot device, r for
		read-only, and 0-7 for scsi id.
		Mounting disks works in Ultrix. Installing to disk usually
		crashes for various reasons, but an OSF/1 install gets
		relatively far (similar to the NetBSD/pmax install).
20040412	Trying to find the bug.
20040415	Finally found and fixed the bug; SCSI reads and writes
		(actually, any data in or data out) can be split up into
		multiple DMA transfers. That stuff was only partially
		implemented, and the part that was implemented was buggy.
		It works now. NetBSD/pmax and Ultrix 4.3 seem to like
		the SCSI stuff enough to install almost all the way.
20040415 (more)	Adding a hack which allows a host's cdrom device to be used as
		a cdrom device inside the emulator, eg /dev/cd0c.
		Making the cycle counter int64_t instead of long, as a 'long'
		overflows too easily on 32-bit machines. (The bug is still
		there, though.)
		I've now verified that a full NetBSD/pmax install can be done.
		If using a PMAG-AA graphics board, startx brings up X :-)
		mips64emul can be compiled inside NetBSD inside mips64emul,
		and it can run NetBSD in that environment. (I'm getting
		dizzy... :-)
20040417	Moving some coprocessor stuff from cpu.c to coproc.c.
20040424	Adding a BT455 vdac for PMAG-AA. Black and white are now
		rendered correctly in Xpmax.
		Adding colormap support to the BT459 device, for PMAG-BA.
20040425	Fixing a buffer length bug, which caused an Ultrix 4.5
		install to bug out on an i386 host.
20040429	FPU experiments.
20040502	More FPU experiments.
		Speedup for exception debug messages:  in quiet mode, debug
		messages were still evaluated, which took a relatively
		large amount of time.
20040503	Most FPU stuff fixed, but there is at least one known bug
		left; ps axu in NetBSD triggers it (ps loops forever).
20040504	A default install of Ultrix 4.5 succeeded! It boots up with
		a graphical login.
		Fixing the keyboard repetition bug (a lk201 "up" (release)
		scancode is now sent after every key).
20040505	Both CR and LF now produce the same lk201 scancode, so that
		pressing 'enter' works as expected in Ultrix.
20040506	Adding a vaddr to paddr translation cache, causing a speedup
		of perhaps 50% or more.
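
A vaddr to paddr translation cache of the kind mentioned in the 20040506 entry above might be sketched as a single remembered page mapping in front of the slow lookup. Everything here is an assumption for illustration: the 4 KB page size, the struct layout, and slow_translate, which is a stand-in with an arbitrary made-up mapping.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  ((1ULL << PAGE_SHIFT) - 1)

/* One-entry cache: the most recently translated page. */
struct tiny_cache {
    int      valid;
    uint64_t vpage;   /* page-aligned virtual address */
    uint64_t ppage;   /* page-aligned physical address */
};

/* Hypothetical slow path (e.g. a full TLB lookup); counts its calls. */
static int slow_lookups = 0;
static uint64_t slow_translate(uint64_t vpage)
{
    slow_lookups++;
    return vpage + 0x100000;    /* arbitrary mapping, for the demo only */
}

/* Translate vaddr, taking the slow path only when the page changes. */
static uint64_t translate(struct tiny_cache *c, uint64_t vaddr)
{
    uint64_t vpage = vaddr & ~PAGE_MASK;

    if (!c->valid || c->vpage != vpage) {
        c->ppage = slow_translate(vpage);
        c->vpage = vpage;
        c->valid = 1;
    }
    return c->ppage | (vaddr & PAGE_MASK);
}
```

Consecutive accesses within one page hit the cached entry, which is where a speedup would come from. A real implementation would also have to invalidate the entry whenever the TLB contents change.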
20040507	Fixing PMAG-BA color for Ultrix. (Ultrix relies on interrupts
		coming from the TURBOchannel slot to update the palette.)
20040508	Fixing cursor positioning for PMAG-BA.
20040511	Prints current nr of instructions per second, not only
		average, when using -N.
20040515	Some more bintrans experiments.
20040606	Adding ARCBIOS GetReadStatus() and Read().
		Adding some instructions: tlt, tltu, tge, tgeu, tne.
20040607	Adding the dsub instruction.
		Some minimal progress on SGI-IP30 emulation.
		Applying a patch from Juli Mallett to src/file.c (I'm not
		sure yet if it breaks or fixes anything).
		Some minor fixes for SGI-IP22 (such as faked board revision
		numbers).
20040608	ll/sc should now fail if any unrelated load/store occurs.
		Minor changes to the configure script.
		Adding some ifdefs around code which is not often used
		(the mfhi/mflo delay, and the last_used TLB experimental
		code); this might cause a tiny speedup.
20040609	Minor fixes.
20040610	Various minor SGI fixes (64-bit ARCS stuff, progress on the
		CRIME/MACE interrupt system, and some other random things).
20040611	More crime/mace progress, and some more work on pckbc.
		KN5800 progress: adding a XMI->BI adapter device; a disk
		controller is detected (but it is just a dummy so far).
20040612	Adding "dev_unreadable", which simplifies making memory
		areas unreadable. (NetBSD on SGI-IP22 no longer detects
		nonexistent hpc1 and hpc2 busses.)
		Implementing rudimentary support for IP22 "local0" and
		"local1" interrupts, and "mappable" local interrupts.
		Some progress on the WDSC SCSI controller on IP22, enough
		to let NetBSD get past the disk detection and enter
		userland!  :-)
		The zs (zilog serial) device now works well enough to let
		NetBSD/sgimips be interacted with on IP22. :-)  (Though
		it is very ugly and hardcoded.)
20040613	IP32 didn't work last night, because there were too many
		tick functions registered. That has been increased now.
		Trying out NetBSD/sgimips 2.0 beta kernels. There are some
		differences compared to 1.6.2, which I'm trying to solve.
		Interrupt fixes for IP32: _serial and _misc are different.
		Separation of IP22 (Full-house) and IP24 (Guinness).
20040614	Modifying the memory layout for IP20,22,24,26 (RAM is now
		offset by 128MB, leaving room for EISA registers and such),
		and moving around some code chunks. This is not well
		tested yet, but seems to work.
		Moving parts of the tiny translation cache, as suggested
		by Juli Mallett.  It seems that the speedup isn't as
		apparent as it was a few weeks ago, though. :-(
		Speedups due to not translating addresses into symbol
		names unless the symbol name is actually printed.
		Added support for loading old big-endian (Irix) ECOFF
		kernels (0x60 0x01 as the first two bytes).
20040615 (late)	Adding enough SGI IP20 (Indigo) support to let NetBSD 2.0
		enter userland :-)  No interrupt specifics are implemented
		yet, so it hangs while doing terminal output.
20040618	Experimenting with the WDSC SCSI controller for IP20,22,24.
20040620	Adding a program which converts SGI prom dumps from text
		capture to binary, and some hacks to try to make such an
		IP22 PROM to work better in the emulator.
20040621	Removing the Nintendo 64 emulation mode, as it is too
		uninteresting to support.
		Adding SCSI tape device support (read-only, so far).
		Fixing a bug which caused the cursor to be corrupted if new
		data was written to the framebuffer, but the cursor wasn't
		moved.
20040622(early)	Finally! Making progress on the SCSI tape stuff; when going
		past the end of a file, automagically switch to the beginning
		of the next.
20040622(late)	Trying to track down the last SCSI tape bugs.
		Removing _all_ dynamic binary translation code (bintrans),
		starting from scratch again.
20040623(early)	Performing a general code cleanup (comments, fixing stuff
		that led to compiler warnings, ...).
		Disabling MIPS16 support by default, and making it a
		configure time option to enable it (--mips16). This gives
		a few percent speed increase overall.
		Increasing performance by assuming that instruction loads
		(reading from memory) will be at the same page as the last
		load.  (Several percent speedup.)
		Moving the list of kernels that can be found on the net from
		README to doc/.
20040624	Finally! I found and fixed the bug which caused 'ps', 'top',
		'xclock', and other programs in NetBSD/pmax to behave weird.
		Increasing performance by a few percent by running as many
		instructions in a row as possible, before checking for
		hardware ticks.
		When booting from SCSI tapes on DECstation, the bootstring
		now contains 'tz' instead of 'rz'.
		Adding a second ARC machine mode, "Acer PICA-61", -A2.
		Disabling the support for "instruction delays" by default
		(it has to be enabled manually in misc.h now, but is never
		used anywhere anyway).
		Other minor optimizations (moving around stuff in the
		cpu struct in misc.h, and caching cpu->pc in cpu.c).
		Separating the tiny translation cache into two, one for
		code and one for data. This gives a few percent speed
		increase.
20040625(early)	I think now is a good time for a "feature freeze",
		to let the code stabilize and then make some kind of
		first release.
20040625(later)	Adding a -v (verbose) command line option. If -v is not
		specified, the emulator goes into -q (quiet) mode just before
		it starts to execute MIPS code.
20040627	The configure script now adds -fomit-frame-pointer to the
		compile flags if the $CC seems to be able to handle that.
		Found and fixed a serious interrupt bug in BT459 (Ultrix'
		behaviour required a hack, which was incorrect), so
		performance for machines using the PMAG-BA framebuffer is
		now improved.
		For X11 bitdepths other than 8 or 24, a warning message
		is printed at startup.
		A number of other minor fixes, optimizations, updated
		comments and so on.
		Adding a BUGS file, a list of known bugs.
		Adding a minimal man page, doc/mips64emul.1.
20040628	Hacks for faking the existence of a second level cache
		(ARCBIOS and other places).
		An important fix for dc7085: tx interrupts should happen
		before rx interrupts, not the other way around as it was
		before. (This speeds up NetBSD boot on DECstation, and
		fixes a bug which Ultrix triggered on heavy keyboard input.)
		A couple of other minor fixes.
		Framebuffer fix: there was a bug which caused the rightmost/
		bottom pixel to sometimes not be updated, when running in
		scaledown mode. This is now fixed.
		Adding a small program which removes "zero holes" from
		harddisk image files.
20040629	More minor fixes.
20040629(later)	Adding -A3 (NEC RISCstation 2200) (this is similar to
		the 2250 model that NetBSD/arc can already boot all the
		way into userland and be interacted with), and -A4
		(Deskstation Tyne).
		Some more minor fixes.
20040630	Adding support for 15 and 16 bits X11 framebuffers,
		and converting from XYPixmap to ZPixmap (this fixes the
		problem of updates appearing in "layers" on some X
		servers).
		The pixels in the mouse cursor (for BT459) are now colored
		as the emulated OS sets them, although no transparency
		masking is done on the edges of the cursor yet. (In plain
		English:  the mouse cursor is no longer just a white solid
		square, you can actually see the mouse cursor image
		on the white square.)

==============  RELEASE 0.1  ==============

20040701	The -j option now takes a name: the name of the kernel as
		passed on to the bootloader.  ("netbsd" is the default name.)
		Adding support to load bootstrap code directly from a disk
		image, for DECstation.  Both NetBSD/pmax and Ultrix boot
		straight off a disk image now, with no need to supply a
		kernel filename on the command line.  (Ultrix still needs
		-j vmunix, though, to boot from /vmunix instead of /netbsd.)
20040702	Minor bugfix (some new untested code for X11 keypresses was
		incorrect).
20040702(later)	Adding an ugly hack for CDROMs in FreeBSD; if an fread() isn't
		done at a 2048-byte aligned offset, it will fail. The hack
		tries to read at 2048-byte aligned offsets and move around
		buffers to make it work.
		Adding video off (screen blanking) support to BT459.
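
The aligned-read hack described above can be sketched roughly as follows. This is only an illustration, not the emulator's actual code: the helper name aligned_read() and the SECTOR_SIZE constant are made up for the example. It seeks to the nearest 2048-byte boundary at or below the wanted offset, reads whole sectors into a temporary buffer, and then copies out the bytes that were asked for.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 2048	/* assumed CD-ROM sector size */

/*
 * Hypothetical helper: read up to len bytes starting at an arbitrary
 * offset, while only issuing an fread() that starts on a SECTOR_SIZE
 * boundary.  Returns the number of bytes copied into buf.
 */
static size_t aligned_read(FILE *f, long offset, unsigned char *buf, size_t len)
{
	const size_t ss = SECTOR_SIZE;
	long aligned_start;
	size_t skip, span, got, copied = 0;
	unsigned char *tmp;

	assert(f != NULL && buf != NULL && offset >= 0);

	aligned_start = offset - (offset % SECTOR_SIZE);
	skip = (size_t)(offset - aligned_start);
	/* Round the span to read up to a whole number of sectors. */
	span = ((skip + len + ss - 1) / ss) * ss;

	tmp = malloc(span);
	if (tmp == NULL)
		return 0;

	if (fseek(f, aligned_start, SEEK_SET) == 0) {
		got = fread(tmp, 1, span, f);
		if (got > skip) {
			copied = got - skip;
			if (copied > len)
				copied = len;
			/* Move the wanted bytes into the caller's buffer. */
			memcpy(buf, tmp + skip, copied);
		}
	}
	free(tmp);
	return copied;
}
```

The temporary buffer costs one extra copy per read, which is probably acceptable for a workaround that only matters on hosts with this restriction.
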

==============  RELEASE 0.1.1  ==============

20040702(later)	Cleanup to remove compiler warnings (Compaq's cc, Solaris' cc,
		and gcc 3.3.3/3.3.4 in Linux), mostly by putting ULL on large
		numeric constants.
		Better support for scaledown of BT459 cursors, but still not
		color-averaging.
		Beginning the work on adding better memory latency support
		(instruction delays), enabled by the --delays configure option.
20040703	Modifications to the configure script so that a config.h file
		is created, containing things that were passed along as
		-Dxxx on each cc command line before.
		More work on instruction latency support; trying to separate
		the concepts of nr of cycles and nr of instructions.
20040704	Working on R2000/R3000 caches.
		Adding a '--caches' option to the configure script.
		Various small optimizations.
		R3000 caches finally work. (I know that there is at least one
		bug, regarding interrupt response.)
20040705	Working on the 'le' device, and on a generic (device
		independent) networking framework. le can transmit and receive
		packets, and the network framework fakes ARP responses from a
		fake gateway machine (at a fixed ip address, 10.0.0.254).
		Adding a '-c' command line option, which makes emulated_hz
		automatically adjust itself to the current number of emulated
		cycles per host CPU second (measured at regular intervals).
20040707	Removing the '-c' option again, and making it the default
		behaviour of the emulator to automatically adjust clock
		interrupts to runtime speed (as long as it is above 1 MHz).
		(This can be overridden by specifying a static clock rate with
		the -I option.)
		Updating the doc/ stuff a bit.
		Generalization of the DECstation bootblock loading, to work
		with Sprite/pmax. Lots of other minor modifications to make
		Sprite work, such as adding support for DECstation "jump table"
		PROM functions, in addition to the old callback functions.
		Sprite boots from a disk image, starting the kernel if the
		argument "-j vmsprite" is used, but it seems to not like the
		DBE exceptions caused by reading empty TURBOchannel slots. :-/
20040708	Minor changes and perhaps some tiny speed improvements.
		The Lance chip is (apparently) supposed to set the length of
		received packets to len+4. (I've not found this in any 
		documentation, but this is what NetBSD expects.) So now, ICMP
		echo replies work :-)  UDP works in the outgoing direction;
		in the incoming direction, tcpdump can see the packets but they
		seem to be ignored anyway. (Weird.)
		Adding a separate virtual-address-to-host-page translation
		cache, 1-entry for loads, 1-entry for stores. (For now, it
		only works on R4000 as there are conflicts with cache usage
		on R3000).
		Changing the lower clock speed bound from 1 MHz to 1.5 MHz.
20040709	Incoming UDP checksums were wrong, but are now set to zero
		and NetBSD inside the emulator now accepts the packets (eg.
		nameserver responses).  Host lookups and even tftp file
		transfers (using UDP) work now :-)
		Adding a section on Networking to the Technical documentation,
		and a preliminary NetBSD/pmax install instruction for network
		installs to the User documentation.
		Some updates to the man page.
20040709(later)	Fix to the TURBOchannel code to allow Sprite to get past the
		card detection. Seems to still work with Ultrix and NetBSD.
		This also makes Linux/DECstation properly recognize both the
		Lance controller and the SCSI controller. Linux 2.4.26 from
		Debian boots nicely in framebuffer mode :-)
20040710	Some bits in the KN02 CSR that were supposed to be readonly
		weren't. That has been fixed, and this allows Linux/DECstation
		to get past SCSI detection. :-)
		Minor updates to the ASC controller, which makes Linux and
		OpenBSD/pmax like the controller enough to be able to access
		SCSI devices. OpenBSD/pmax boots from a disk image for the
		first time. :-)  Linux detects SCSI disks, but I have no
		bootable Linux diskimage to test this with.
		Updating the doc/ to include instructions on how to install
		OpenBSD/pmax onto a disk image.
		Naively added a PMAGB-BA (1280x1024x8) in hopes that it would
		basically be a PMAG-BA (1024x864x8) in higher resolution,
		but it didn't work that way. I'll have to look into this later.
		Adding a -o option, useful for selecting '-s' (single user
		mode) during OpenBSD install and other things.
		After a lot of debugging, a serious bug related to the tiny
		cache was found; Linux just changes the ASID and returns when
		switching between processes on some occasions without actually
		_writing_ to the TLB, and I had forgotten to invalidate the
		tiny cache on such a change.
20040711(early)	I've been trying to repeat the OpenBSD install from yesterday,
		but apart from the first initial install (which was
		successful), I've only been able to do one more. Several
		attempts have failed with a filesystem panic in the middle
		of install. I'm not sure why.
20040711	I found the "bug": wget downloaded the simpleroot28.fs.gz file
		as read-only, and gunzip preserved those flags. Thus, OpenBSD's
		installer crashed as it didn't get its writes through to the
		disk.
		Parts of the 1280x1024x8 PMAGB-BA graphics card have been
		implemented; it works (unaccelerated) in NetBSD and OpenBSD,
		but Ultrix does not seem to like it.
		Cleaned up the BT459 cursor offset stuff a bit.
		Trying to make the emulated mouse coordinates follow the host's
		mouse coordinates (for lk201, DECstation), by
		"de-accelerating" the data sent to the emulated OS.
20040711(later)	Fix so that Sprite detects the PMAG-BA correctly.
		Adding some stuff about NFS via UDP to the documentation.
		Fixed the 'update flag' for seconds, so now Sprite doesn't
		crash because of timer-related issues anymore.
		Fixing KN02 interrupt masks a bit more, to make Sprite not
		crash. Sprite now runs quite well.
20040712	Working on IP/UDP fragmentation issues. Incoming UDP packets
		from the outside world can now be broken up into fragments
		for the guest OS. (This allows, for example, OpenBSD/pmax to
		be installed via nfs.)  Outgoing fragmented packets are NOT
		yet handled.
		Linux doesn't use 64-bit file offsets by default, which is
		needed when using large disk images (more than 2GB), so the
		configure script has now been modified to add necessary
		compiler flags for Linux.
20040713	Trying out some minor optimizations.
		Refreshing the UDP implementation in src/net.c a little.
20040714	Updating the documentation a little on how to experiment
		with a Debian Linux install kernel for DECstations.
		A 'mini.iso' Linux image for DECstation has different fields
		at offsets 0x10 and 0x14, so I'm guessing that the first is
		the load address and the second is the initial PC value.
		Hopefully this doesn't break anything.
		Some initial TCP hacks, but not much is working yet.
		Some updates for IP30:  The load/store 1-entry cache didn't
		work too well with IP30 memory, so it's only turned on for
		"MMU4K" now. (This needs to be fixed some better way.)
		Adding a hack which allows Linux/Octane to use ARC write()
		and getchild() on IP30. Linux uses ARCBIOS_SPB_SIGNATURE as a
		64-bit field, it was 32-bit before.
		Making ugly hacks to the arcbios emulation to semi-support
		64-bit equivalents of 32-bit structures.
20040716	Minor fixes to the configure script (and a few other places)
		to make the sources compile out-of-the-box on HP-UX (ia64
		and HPPA), old OpenBSD/pmax (inside the emulator itself), and
		Tru64 (OSF/1) on Alpha.
		A couple of other minor fixes.
20040717	A little TCP progress; OpenBSD/pmax likes my SYN+ACK replies,
		and tries to send out data, but NetBSD/pmax just drops the
		SYN+ACK packets.
		Trial-and-error led me to change the 64-bit ARCS component
		struct again (Linux/IP30 likes it now). I'm not sure about all 
		of the offsets yet, but some things seem to work.
		More 64-bit ARCS updates (memory descriptors etc).
		Better memory offset fix for IP30, similar to how I did it for
		IP22 etc. (Hopefully this doesn't break anything else.)
		Adding a MardiGras graphics controller skeleton for SGI-IP30
		(dev_sgi_mardigras.c).
		Thanks to Stanislaw Skowronek for dual-licensing mgras.h.
		Finally rewrote get_symbol_name() to O(log n) instead of O(n)
		(Stanislaw's Linux kernel had so many symbols that tracing
		with the old get_symbol_name() was unbearably slow).
		Removing all of the experimental tlbmod tag optimization code
		(the 1-entry load/store cache), as it causes more trouble than
		the performance gain was worth.
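
For reference, an O(log n) symbol lookup of the kind mentioned for get_symbol_name() is usually a binary search over a table sorted by address. The sketch below is only an assumption about the general shape; struct symbol and lookup_symbol() are hypothetical names, and the real function may store and return things differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol record, sorted by ascending addr in the table. */
struct symbol {
	uint64_t addr;
	const char *name;
};

/*
 * Return the symbol with the highest address that is <= addr, or NULL
 * if addr lies below the first symbol.  Runs in O(log n).
 */
static const struct symbol *lookup_symbol(const struct symbol *syms,
    size_t n, uint64_t addr)
{
	size_t lo = 0, hi = n;	/* the search window is [lo, hi) */

	assert(syms != NULL || n == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (syms[mid].addr <= addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	/* lo is now the index of the first symbol above addr. */
	return lo == 0 ? NULL : &syms[lo - 1];
}
```
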
20040718	The MardiGras device works well enough to let Linux draw the
		SGI logo and output text.
		A bunch of other minor changes.
20040719	Trying to move out all of the instruction_trace stuff from the
		main cpu loop (for two reasons: a little performance gain,
		and to make it easier to add a GUI later on).
20040720	Finally found and fixed the ethernet/tcp bug. The hardware
		address is comprised of 6 bytes, where the _first_ byte should
		have a zero as the lowest bit, not the last byte. (This causes
		NetBSD and Linux running in the emulator to accept my SYN+ACK
		packets.)
		Getting the first nameserver address from /etc/resolv.conf.
		(This is not used yet, but could be useful if/when I add
		internal DHCP support.)
		Working more on the TCP stuff; TCP seems to be almost working,
		the only immediate problem left is that the guest OS gets
		stuck in the closing and last-ack states, when it shouldn't.
		It is now possible to install NetBSD and OpenBSD via ftp. :-)
20040721	Trying to fix the last-ack bug, by sending an RST after the
		end of a connection. (Probably not a correct fix, but seems
		to work?)
		Adding a my_fseek() function, which works like fseek() but
		with off_t instead of long, so that large disk images can
		be used on systems where long is 32 bits.
20040722	Trying to fix some more TCP related bugs.
20040725	Changing the inlined asm statement in bintrans_alpha.c into
		a call to a hardcoded array of bytes that do the same thing
		(an instruction cache invalidation). This allows the same
		invalidation code to be used regardless of compiler.
		Some other minor changes.
20040726	Minor updates. The configure script is now more verbose.
		A Debian/IP22 Linux tftp boot kernel requires ARCS memory to
		be FreeMemory, not FreeContiguous. (This should still work with
		other SGI and ARC OSes.)
		Fix for ARCS write(), so it returns good write count and
		success result (0).
		Some hacks to the IP22 memory controller, to fake 72MB RAM
		in bank 0.
		The IP22 Debian kernel reaches userland (ramdisk) when run
		with -G24 -M72 -CR4400, if a special hack is done to the
		zs device.
20040730	Removing mgras.h, as I'm not sure a file dual-licensed this way
		would work. (Dual-licensing as two separate files would work
		though.)
		Preparing for the upcoming release (0.2).
20040801	Fixing the 512 vs 2048 cdrom sector size bug; I hadn't 
		implemented the mode select SCSI command. (It still isn't
		really implemented.)
		A bug which crashes the emulator is triggered when run with
		new NetBSD 2.0_BETA snapshots on a Linux/i386 host. I'm not
		sure why.
		UDP packets sent to the gateway (at 10.0.0.254) are now
		forwarded to the machine that the host uses as its nameserver.
		Some other minor fixes.

==============  RELEASE 0.2  ==============

20040803	A post-3.5 OpenBSD/sgimips kernel snapshot with ramdisk seems
		to boot fine in the emulator, all the way to userland, and
		can be interacted with.
		Adding a -y option, used to set how many (random) instructions
		to run from each CPU at max (useful for SMP instruction
		interleave experiments).
		Importing a 8x16 console font from FreeBSD (vt220l.816).
		Adding a skeleton for a 80x25 text console device (dev_vga),
		useful for some ARC modes. (Character output is possible, but
		no cursor yet.)
		Adding a dev_zero device (returns zeroes on read).
		OpenBSD/arc 2.3 can get all the way to userland with -A4 (if
		the wdc devices are commented out) but bugs out there, probably
		because of interrupt issues.
		Adding a -A5 ARC emulation mode (Microsoft-Jazz, "MIPS Magnum")
		which NetBSD seems to like. No interrupt specifics yet, so
		it hangs while waiting for SCSI.
20040804	Some dev_mp updates.
		The -y switch has to do with number of cycles, not number
		of instructions; variable names have been changed to reflect
		this.
20040805	Minor updates. Adding some more CPU types/names, but they
		are probably bogus.
		Adding a MeshCube emulation mode. Just a skeleton so far, but
		enough to let a Linux kernel print some boot messages.
		Adding the 'deret' instruction.
20040806	Adding include/impactsr-bsd.h (a newer version of what was in
		mgras.h before, and this time with only a BSD-style license),
		but it is not used yet.
20040810	Some Au1500 updates.
20040811	Adding the 'clz', 'clo', 'dclz', and 'dclo' special2 (MIPS32
		and MIPS64) instructions.
		More Au1500 updates.
20040812	Using fseeko(), when it is available.
		Other minor updates.
		Adding a NetGear WG602 emulation mode skeleton (-g); after
		a lot of trial and error, a Linux kernel (WG602_V1715.img)
		gets all the way to userland, but hangs there.
20040818	Adding signal handlers to better cope with CTRL-Z and CTRL-C.
		Adding a simple interactive single-step debugger which is
		activated by CTRL-C. (Commands so far: continue, dump, help,
		itrace, quit, registers, step, trace, version)
20040818(later)	Adding a 'tlbdump' debugger command, and some other minor
		fixes.
20040819	Minor updates. Adding an 'unassemble' debugger command.
20040822	Minor updates to the regression testing framework.
20040824	Minor updates based on feedback from Alec Voropay
		(configure script updates for Cygwin and documentation).
20040826	Minor updates.
		Adding a cursor to the VGA text console device.
		Changing all old 11:22:..55:66 ethernet macs to 10:20..60,
		still hardcoded though.
20040828	Minor updates.
20040829	mips64emul is 1 year old today :-)
20040901	tests/README now lists "all" MIPS opcodes. This list should
		be updated whenever a new opcode is implemented, or when a
		regression test is added. (A combination of instructions from
		the TX79 manual, the GNU assembler, and the MIPS64 manual).
		Hopefully I haven't missed too many.
		Adding a section on regression testing to doc/technical.html.
20040902	Finally beginning the work on separating out the stuff from
		main.c into a "struct emul". Very time-consuming.
		Some minor fixes for LL/SC on R10000.
20040905	Moving more stuff from main.c into struct emul. Unfortunately,
		it seems that this causes a slowdown of the emulator.
		Userland emulation is now only used if --userland is used
		when running configure.
		Modifying src/symbol.c to not use global variables.
20040906	Minor update.
20040914	Using $COPTIM when detecting which compiler flags to use in
		the configure script. (This makes sure that combinations of
		flags should work.)
		There'll probably be a 0.2.1 release some time soon, but I'll
		do some more clean-up first.
		Minor update to the detection of ECOFF files, but I don't like
		it; sometimes the endianness of the magic value seems to be
		swapped, but it doesn't have to do with endianness of the
		actual data?
20040916	Minor updates. Adding an Example section to the manpage, but
		as I'm not really familiar with manpage formatting, it will
		need to be rewritten later.
20040917	Finally making the coprocessor instructions disassemblable
		even when not running.
		Doing some testing for the 0.2.1 release.

==============  RELEASE 0.2.1  ==============

20040923	Updating the documentation about how to (try to) install
		Debian GNU/Linux.
20040924	Some more updates to the documentation.
20040925	Adding overflow stuff to 'add' and 'sub'.
20040926	Minor updates: possibly a fix to 'sltiu' (the imm value
		should be treated as signed, and then converted to unsigned,
		according to the MIPS64 manual), and removing the
		'last_was_rfe' stuff (again).
		OpenBSD/arc used speed-hack jumps with other deltas than just
		+/- 1 (it used -3 iirc), so the jump speedhack should now
		support any delta. Also adding bgtzl and blezl as possible
		instructions for the speed-hack jumps. (This needs to be
		tested more.)
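
The 'sltiu' behaviour noted above (the immediate is sign-extended first, and only then compared as an unsigned value) can be illustrated like this. The function is a standalone sketch with a made-up signature, not the emulator's instruction handler.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Result of 'sltiu rt, rs, imm' for 64-bit registers: the 16-bit
 * immediate is sign-extended to 64 bits, and the comparison itself
 * is then done on unsigned values.
 */
static uint64_t sltiu(uint64_t rs, uint16_t imm)
{
	/* Sign-extend, then reinterpret the result as unsigned. */
	uint64_t ext = (uint64_t)(int64_t)(int16_t)imm;

	return rs < ext ? 1 : 0;
}

/* A few spot checks of the behaviour described above. */
static void sltiu_selftest(void)
{
	assert(sltiu(2, 3) == 1);
	assert(sltiu(5, 3) == 0);
	/* 0xffff sign-extends to 0xffffffffffffffff, so this is true. */
	assert(sltiu(0x10000, 0xffff) == 1);
}
```

Had the immediate been zero-extended instead, the last check would compare 0x10000 against 0xffff and give 0.
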
20040928	Minor updates. Some ARC stuff ("arcdiag" runs now).
		cpu_register_dump() now also dumps coprocessor registers.
20040929	More ARC updates. Making the code look a tiny bit nicer
		than before. "arcdiag.ip22" works for -G22 (SGI-IP22).
		Apparently the overflow support in the 'add' instruction
		was incorrect, so I disabled it.
20041002	Trying to install Ultrix in the emulator, but the installer
		crashes; found (and fixed) the bug rather quickly: the "fix"
		I implemented a few days ago for the 'sub' instruction
		(according to the MIPS64 manual) caused the bug.
20041004	Changing the behaviour of the -j command line option. The
		default is now "" (or taken from the last filename given on
		the command line), not "netbsd". In practice, this doesn't
		change much, except that -j netbsd.pmax is no longer needed
		when installing NetBSD.
		Adding a COMPILE_DATE string to config.h.
20041007	Adding a NEC RISCserver 4200 model (-A6), and some more
		updates to the ARC component tree generator.
20041008	The 'll' instruction should be signed, not unsigned as before.
		This (and some other minor fixes) causes Irix on SGI-IP32 (O2)
		to actually boot far enough to print its first boot messages :)
		Working on some new dynamic bintrans code. Enough is now
		implemented so that the 'nop' instruction is translated
		and there is support for Alpha, i386 and UltraSparc backends,
		but performance is about 50% worse than when running without
		bintrans. (This is as expected, though.)
20041009	Minor updates to the documentation.
		Using mprotect() to make sure that the code created dynamically
		by the bintrans subsystem is allowed to be executed. (This
		affects newer OpenBSD systems, and possibly others.)
		The translated code chunks now only get one argument passed to
		them, the (struct cpu *) of the current cpu.
20041010	Hack to dev_le.c which makes Ultrix accept the initialization
		of the LANCE controller. (This goes against the LANCE
		documentation though.)
		In src/net.c, a fix for Ultrix (which seems to send larger
		ethernet packets than the actual TCP/IP contents). The hack to
		dev_le.c and this fix is enough to let Ultrix access the
		Internet.
		For DECstation, when booting without a disk image (or when
		"-O" is used on the command line), use "tftp" instead of "rzX"
		for the boot string.
20041011	Adding cache size variables to the emul struct, so that these
		can be set on a per-machine basis (or potentially manually
		on the command line).
20041012	Mach/PMAX now passes the LK201 keyboard self-test (although
		the keyboard ID is still bogus).
20041013	Minor updates.
		Hacks to the ASC SCSI controller for Mach/PMAX, hopefully this
		will not break support for other OSes.
20041014	Minor fix to src/emul.c for reading bootblocks at the end of
		a disk or cdrom image (thanks to Alexandru Lazar for making me
		aware of this).
		Adding "gets()" to src/dec_prom.c.
		Working a bit on ARC stuff. Importing pica.h from NetBSD.
		Minor updates to the ARC component tree for PICA-61.
		Adding a dev_jazz.c (mostly for PICA-61).
		Renaming dev_jazz.c into dev_pica.c. Working on PICA timer
		and interrupt specifics.
20041016	Adding some dummy entries to lk201.c to reduce debug output.
		Some bintrans updates (don't run in delay slots or nullified
		slots, read directly from host memory and not via memory_rw(),
		try mmap() before malloc() at startup, and many other minor
		updates).
		Adding bintrans_mips.c for 64-bit MIPS hosts, but it is not
		used yet.
20041017	Minor updates.
20041018	Update to dev_mc146818 to allow Mach to boot a bit further.
		The "hardware random" in dev_mp.c now returns up to 64 bits
		of random data.
20041019	Minor updates to the way cache sizes are used throughout the
		code. Should be mostly ok for R[234]x00.
		src/file.c now loads files using NO_EXCEPTIONS. Whether this
		is good or bad, I'm not sure.
20041020	Adding a Linksys WRT54G emulation skeleton (-H).
20041021	Minor updates.
		R1[024]000 cache size bits in the config register should now
		be ok.
		Trying to make dev_asc.c work better with PICA.
		More work on PICA interrupts (but they are broken now).
20041022	Generalizing the dev_vga text console device so that it can be
		used in other resolutions than just 80x25. Works with
		OpenBSD/arc.
		emul->boot_string_argument is now empty by default (except
		for DECstation modes, where it is "-a").
		Speedup of dev_ram by using mmap() instead of malloc().
		Using mmap() in memory.c as well, which reduces memory usage
		when emulating large memory sizes if the memory isn't actually
		written to.
20041023	Minor updates.
20041024	Updates to the PC-style keyboard controller, used by PICA.
		Updates to the PICA (Jazz) interrupt system. Both NetBSD/arc
		and OpenBSD/arc now reach userland with PICA emulation, and
		can be interacted with (there are a few programs on the
		INSTALL kernel ramdisks). In the case of OpenBSD, a VGA text
		console and PC-style keyboard controller are used; NetBSD
		runs on a serial console.
		Adding a framework for DMA transfer for the ASC SCSI
		controller.
		Implementing a R4030 DMA controller for PICA, enough to let
		OpenBSD/arc and NetBSD/arc be installed on an emulated
		Pica. :-)
		Updates to the documentation.
20041025	Working on ISA interrupts for PICA.
		Adding an Olivetti M700 emulation mode (-A7).
		Better separation of PICA and M700 stuff (which I accidentally
		mixed up before, I thought the M700 Linux kernel would 
		also work on PICA because it almost booted).
		Writing a skeleton G364 framebuffer for M700, enough to show
		the Linux penguin and some text, although scrolling isn't
		correctly implemented yet.
		Adding a dummy SONIC (ethernet) device, dev_sn, for PICA.
		Fixing the passing of OSLOADOPTIONS for ARC, the default is
		now "-aN" which works fine with OpenBSD/arc and NetBSD/arc.
20041027	Minor updates.
20041029	Adding a Sony NeWS "newsmips" emulation mode skeleton (-f).
		Found and fixed a bug which prevented Linux/IP32 from running
		(the speed-hack-jump-optimization fix I made a few weeks ago
		was buggy).
		Adding the trunc.w.fmt and trunc.l.fmt instructions, although
		they are probably not really tested yet.
		Changes to how floating point values are handled in
		src/coproc.c, but right now it is probably very unstable.
20041101	I had accidentally removed the instructions on how to install
		Ultrix from doc/index.html. They are back now.
		Adding a -Z option, which makes it easier to run dual- or
		triple-head with Ultrix. (Default nr of graphics cards
		without -X is 0, with -X is 1.)
		Minor update which makes it possible to switch to the left
		monitor when running triple-head, not just right as before.
		When using more than one framebuffer window, and the host's
		mouse cursor is in a different window than the emulated mouse
		cursor, the emulated mouse will now try to move "very far",
		so that it in practice changes screen.
		Running Ultrix with dual- and triple-head now feels really
		great.
20041101(later)	OpenBSD/arc and Linux/Olivetti-M700 don't both work at the
		same time with the speed-hack stuff. So, from now on, you
		need to add -J for Linux, and add nothing for OpenBSD.
20041102	Minor update for OSF/1 V4.0 (include sys/time.h in src/net.c
		and add -D_POSIX_PII_SOCKET to the C compiler flags).
20041103	Minor updates for the release.
		For some reason, Mach/PMAX caused the emulator to bug out on
		SunOS/sparc64 (compiled in 64-bit mode); a minor update/hack
		to dev_asc fixed this.

==============  RELEASE 0.2.2  ==============

20041103	Minor updates.
20041104	Minor updates.
20041105	Running with different framebuffer windows on different X11
		displays works now (even with different bit depths and
		endiannesses on different displays). A new command line option
		(-z) adds DISPLAYs that should be used.
		Update regarding how DECstation BT459 cursors are used;
		transparency :-) and some other bug fixes.
20041106	More bt459 updates. The cursor color seems to be correct for
		NetBSD, OpenBSD, Ultrix, and Sprite.
		Some minor bintrans updates (redesigning some things).
20041107	More bintrans updates (probably broken for non-Alpha targets).
		Moving doc/mips64emul.1 to man/.
20041108	Some updates.
20041109	More updates. Bintrans experiments mostly.
20041110	Some minor bintrans updates.
20041111	Minor updates.
20041112	A little rewrite of the bintrans system (again :-), this time
		a lot more naïve and non-optimizing, in order to support delay
		slots in a much simpler way.
		Ultrix 4.5 boots into a usable desktop on my home machine in
		3min 28sec, compared to 6-8 minutes without bintrans.
20041113	Some minor bintrans updates.
20041114	More bintrans updates. Ultrix now boots in exactly 3 minutes
		on my home machine.
20041115	More bintrans updates.
20041116	Bintrans updates.
20041117	Working on dev_dec_ioasic and related issues.
		Adding support for letting translated code access devices in
		some cases (such as framebuffers).
20041118	Moving some MIPS registers into Alpha registers, which gives
		a speed improvement.
		Beginning to write an i386 bintrans backend. Skeleton stuff
		works, lui, jr/jalr, addiu/daddiu/andi/ori/xori, j/jal,
		addu/daddu/subu/xor/or/nor/and.
20041119	dsubu/sll/srl/sra, rfe,mfc0,dmfc0, beq,bne, delayed branches.
		Some load/store (but not for bigendian emulation yet.)
		Time to reach Ultrix 4.5's graphical login on a 2.8 GHz Xeon
		host is now down to 20 seconds!
		Adding bgez, bltz, bgtz, and blez to the i386 backend.
20041120	Minor updates (bintrans related mostly).
		Time to reach Ultrix login on the Xeon is now 11 seconds.
		Adding 'mult', 'multu' and some parts of mtc0 to the Alpha
		backend.
		The transparency updates to the X11 cursor support made the
		OpenBSD/arc cursor disappear; that has been fixed now.
		Unfortunately, something with Acer Pica emulation is broken
		when bintrans is enabled.
20041121	Making tlbwr, tlbwi, tlbp, tlbr callable directly from
		translated code.
		Adding sltiu, slti, slt, and sltu to the i386 backend.
20041122	More bintrans updates.
		With the Alpha backend, the status and entryhi registers
		can (in some cases) be written without exiting to the main
		loop. Ultrix boot time until a usable desktop is reached
		is about 1 min 35 seconds on the 533 MHz pca56.
		Adding srlv, srav, and sllv to the i386 backend.
20041123	Removing the special handling of high physical addresses for
		DECstation emulation from the main memory handling code, and
		replacing it with a mirror device instead. (This results in
		a tiny increase in performance, and cleaner code.)
		Various minor updates.
20041124	Ripping out _all_ bintrans load/store code, because I have
		a new idea I'd like to try out.
		A total rewrite of the load/store system. It works when
		emulating 32-bit MIPS, but not for 64-bit code yet.
		Some minor updates to the dev_fb, but no speed improvement.
		Making the 'le' ethernet device's SRAM work with bintrans.
20041125	Various updates.
		Adding a little "bootup logo" to the framebuffer.
		There is now one translate_address() for R3000-style MMUs,
		and one for the other types. (This gives a tiny speed
		improvement.)
20041126	Minor updates, bintrans.
		Fixing the bug which caused OpenBSD/arc (R4000) to bug out;
		it was introduced between the 7th and 10th of November
		when moving up the check for interrupts to above the code
		which runs bintrans code, in src/cpu.c.
		Adding movn and movz to the Alpha bintrans backend.
20041127	Various minor updates.
20041128	Making the R2000/R3000 caches work with bintrans, even in
		isolated mode. (Not true cache emulation, but it works with
		NetBSD/pmax, OpenBSD/pmax, and Ultrix.)
		Making the default cache size for R3000 4 KB instr, 4 KB data;
		a real R3000 could have 64KB each, but emulated OSes run
		faster when they think the cache is smaller :-)
		Updates to the i386 backend: the nr of executed instructions
		is now placed in ebp at all times, and some support for
		mtc0 similar to how it is done in the Alpha backend has been
		added. A full NetBSD/pmax 1.6.2 install can now be done in
		5 minutes 35 seconds, on a 2.8 GHz Xeon host (with -bD2 -M20).
		Adding mult and multu to the i386 bintrans backend.
		Reducing the number of malloc/free calls used by the
		diskimage subsystem.
20041129	Minor updates to the Alpha bintrans backend.
20041130	Trying to fix the bug which prevents Linux from working
		with bintrans. It _seems_ to work now. (Pages could in some
		cases be written to after they were translated, but that
		has been fixed now.)
		A couple of other minor fixes.
		Minor updates to the Alpha backend (directly using Alpha
		registers in some cases, instead of loading into temporaries).
		Updates to the i386 backend (special hacks for 32-bit
		MIPS emulation, which are fast on i386, for example only
		updating half of the pc register).
20041201	More updates to the i386 backend, similar to those yesterday.
		Preparing for release 0.2.3.
		Adding a generic load/store mechanism, which is used when the
		32-bit optimized version cannot be used (for R4000 etc).

==============  RELEASE 0.2.3  ==============

20041202	If ALWAYS_SIGNEXTEND_32 is defined in misc.h, and an
		incorrectly extended register is detected, the emulator now
		exits instead of continuing.
		Removing the LAST_USED_TLB_EXPERIMENT stuff.
		Minor updates to work better with Windows NT's ARCINST.EXE;
		printing 0x9b via arcbios becomes ESC + '[', and the ARC
		memory descriptor stuff has been generalized a bit more.
		Adding arcbios hacks for Open(), Seek(), GetRelativeTime(),
		and Read() to allow WinNT's SETUPLDR to read the filesystem
		on the diskimage used for booting.
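
The 0x9b handling mentioned above amounts to expanding the single-byte CSI code into the two-byte sequence ESC + '['. A small sketch, where expand_csi() is a made-up name and the output buffer is assumed to be large enough:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy n bytes from src to dst, expanding each 0x9b byte into the
 * two-byte sequence ESC '['.  dst must have room for 2 * n bytes.
 * Returns the number of bytes written to dst.
 */
static size_t expand_csi(const unsigned char *src, size_t n,
    unsigned char *dst)
{
	size_t i, out = 0;

	assert(src != NULL && dst != NULL);

	for (i = 0; i < n; i++) {
		if (src[i] == 0x9b) {
			dst[out++] = 0x1b;	/* ESC */
			dst[out++] = '[';
		} else {
			dst[out++] = src[i];
		}
	}
	return out;
}
```
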
20041203	Adding a terminal emulation layer which converts arcbios
		stdout writes to "VGA text console" cell characters. Seems
		to work with Windows NT and arcdiag.
		Adding a 8x8 font to dev_vga.
		Adding more ARC components to the component tree (for PICA
		emulation).
20041204	Minor updates.
		More updates to dev_vga. Adding a 8x10 font.
		Adding a hack so that the framebuffer logo is visible at
		startup with dev_vga. (It disappears at the first scroll.)
		A minor fix for a bug which was triggered when running
		dual- or triple-head, on 2 or 3 actual X11 displays.
20041205	Fixing a bintrans bug.
		Some other minor updates (some of them bintrans related).
20041206	Moving the web page to http://gavare.se.
		Adding a hack for mmap() which supports anonymous mapping
		using /dev/zero, but not using MAP_ANON{,YMOUS}.
		Separating out opcodes.h, cop0.h, and cpu_types.h from misc.h.
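
On hosts without MAP_ANON or MAP_ANONYMOUS, zero-filled memory can be obtained by mapping /dev/zero privately, which is the idea behind the mmap() hack noted above. The sketch below shows that technique in isolation; map_zeroed() is a hypothetical name and error handling is kept to a minimum.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Return len bytes of zero-filled, private, writable memory obtained
 * by mapping /dev/zero, or NULL on failure.
 */
static void *map_zeroed(size_t len)
{
	void *p;
	int fd;

	assert(len > 0);

	fd = open("/dev/zero", O_RDWR);
	if (fd < 0)
		return NULL;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	close(fd);	/* the mapping stays valid after the close */

	return p == MAP_FAILED ? NULL : p;
}
```

Pages of such a mapping are typically not backed by real memory until they are first written to, which fits the earlier note about reduced memory usage when emulating large memory sizes.
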
20041207	Minor bintrans update. (In some cases, it isn't necessary
		to return to the main loop, when translating from a new page.)
		Some other minor i386 bintrans backend optimizations.
		And some other minor updates.
		i386 backend update: the lowest 32 bits of the pc register
		are now placed in an i386 register.
20041208	Adding GetConfigurationData() and some support for config
		data, to src/arcbios.c.
		Adding a bogus 0xbd SCSI command (used by Windows NT). It is
		not listed in http://www.danbbs.dk/~dino/SCSI/SCSI2-D.html.
		If the framebuffer cursor contains more than 1 color, then
		the host's X11 cursor disappears. (Nice for DECstation
		emulation with emulated X.)
		For ARC and SGI emulation, if an exception occurs before an
		exception handler is installed, the emulator now exits
		nicely (as suggested by Alec Voropay).
		A couple of minor updates to the ARCBIOS emulation subsystem.
		The single step debugger is now automatically entered when
		all CPUs have stopped running, unless there was a clean
		shutdown of some kind (PROM halt() call, or similar).
		Adding a -V option for starting up in a paused state, into
		the single-step debugger.
		Adding a note about 'mmon' to the documentation
		(http://www.brouhaha.com/~eric/software/mmon/).
20041209	Fixes to devices/console.c which makes cursor keys and such
		a bit more reliable.
		ARCBIOS hack/update which creates memory descriptors _after_
		loading the executable. (Seems to work with OpenBSD/arc,
		NetBSD/arc, arcdiag, IRIX, NetBSD/sgimips, OpenBSD/sgi, and
		some Windows NT executables.)
		ARCBIOS support for cursor keys (ESC + '[' ==> 0x9b).
		A bintrans update (for 32-bit emulation) which speeds up
		jumps between pages, if code is already translated.
		Changing the default bintrans cache from 20 to 24 MB.
20041210	Optimizing unaligned load/stores a little bit in src/cpu.c.
		Omitting the check for the number of executed bintrans
		instructions on some forward jumps.
		Adding the 'syscall' and 'break' instructions to the
		bintrans backends.
		Allowing more bits of the status register to be written to
		from inside translated code, on R3000.
		Getting rid of the final pixel when hiding the host's mouse
		cursor.
		store_buf() now copies data 8 or 4 bytes at a time, when
		possible. (This speeds up emulated ROM disk reads, etc.)
		Tiny bug fix: coprocessor unusable exceptions are now also
		generated (for coproc 1..3) even when in kernel mode, if the
		coprocessors are not enabled. This allows a Debian installation
		to proceed further than before. (It's still very unstable,
		though.)
20041212	Updating doc/index.html with better Debian installation
		instructions.
		If SLOWSERIALINTERRUPTS is defined at compile time, interrupts
		from the dc7085 device will not come as often as they normally
		do. This makes Debian seem more stable.
		Decreasing the bintrans cache to 20 MB again.
		Updating some files in preparation for a 0.2.4 release.
20041213	Updating the docs on how to install NetBSD 2.0/pmax, and also
		some updates to the section on installing Debian.
		32-bit bintrans backend optimization: don't inline large
		chunks of code, such as general jumps.
20041214	Minor fix for coproc unusable for R4000 (it's the PC that
		matters, not the KSU bits).
		Separating out the debugger from emul.c into debugger.c.
		Rewriting parts of the debugger.
		Removing the -U command line option, as it wasn't really
		useful. Also removing the -P option.
		Renaming all instances of dumppoint to breakpoint, as that
		is what it really is.
		When a breakpoint is reached, the single-step debugger is
		entered, instead of just turning on instruction trace.
		Adding a 'breakpoints' debugger command.
		Better fix for coproc unusable on R4000: the KSU bits matter,
		but the ERL and EXL bits override that.
		Fix which allows Debian to boot directly from a disk image
		(with DELO). (It reads multiple separate areas from disk.)
		Update to the SLOWSERIALINTERRUPTS stuff, making it even
		slower.
		Fixes based on feedback from Alec Voropay (-Q with ARC
		emulation skips the setup of arcbios data structures in
		memory, and no sign-extension _after_ writing a 32-bit
		value to a 64-bit coproc 0 register).
		Adding a 'devices' command to the debugger.
		The 'registers' and 'tlbdump' commands now take an optional
		argument (a cpu id).
		Adding rudimentary tab-completion and cursor key stuff to
		debugger_readline().
		Adding some more debugger commands: 'bintrans' and 'machine'.
20041215	Adding a 'devstate' command; implementing a skeleton for a
		state function for the bt459 device.
		Implementing yet another variant of the SLOWSERIALINTERRUPTS
		stuff.
		Implementing more of the different exception offsets (taking
		CAUSE_IV and STATUS_BEV into account).
		hpc_bootinfo should now be correctly filled on big-endian
		hosts.
		Always shift left by 12, not by pageshift, to get physical
		addresses on MMU4K etc. (Thanks to Alec Voropay for noticing
		this.)
20041216	The KN02's CSR can now be read from bintranslated code.
		Adding a dummy dev_sgi_mec.
20041217	The default framebuffer and model settings for -F (hpcmips)
		should now be almost like Cassiopeia E-500.
		Changing -DSLOWSERIALINTERRUPTS into a command line option, -U.
20041218	Continuing a little bit on the mec controller.
		Removing lots of #include <math.h> that weren't really used.
20041219	Fixing stuff that broke because of the pageshift bugfix.
		Adding an argument to the s (step) debugger command, for doing
		more than 1 step at a time.
		ARCBIOS components representing disk images are now created
		to actually match the disk images in use, and some other
		arcbios-related updates; adding a dummy GetComponent().
		Adding a 'lookup' command to the debugger, for symbol lookups.
		Adding a "NEC Express RISCserver" mode (NEC-R96, -A8).
		Adding a dummy ARCBIOS GetFileInformation(), GetTime(), and
		SetEnvironmentVariable().
20041220	Improved command line editing (including command history)
		in the debugger.
		Separating some more .h files from each other, and fixing
		some Solaris compiler warnings.
20041221	Minor updates.
20041222	Minor updates; hpcmips (BE300, VR41xx) stuff.
		The 'register' debugger command is now 'reg', and it can
		be used to modify registers, not just read them.
		The syntax for hpcmips (-F) is now -F xx, where xx is a
		machine model identifier. (1 = BE300.)
20041223	Some really minor updates.
20041226	Minor updates to doc/index.html (NetBSD 1.6.2 -> 2.0, and
		some other rearrangements).
		Many updates to the debugger (better register manipulation,
		breakpoint manipulation, and other updates).
		Fix to dev_cons.c to allow the regression tests to work again.
		The configure script now tries to detect the presence of a
		MIPS cross compiler. (Used by "make regtest".)
		Regression tests are now run both with and without bintrans.
20041227	Some hacks to the VR41xx code to allow Linux for BE300 to
		get far enough to show the penguin on the framebuffer.
20041228	Merging dev_kn01_csr.c and dev_vdac.c into dev_kn01.c.
20041229	Various updates to the debugger (nicer tlb output and other
		things).
		Some floating point fixes in src/coproc.c (mov is not
		an arithmetic instruction), and in src/cpu.c (ldcX/sdcX in
		32-bit mode uses register pairs).
		'-O' now also affects the bootstring for SGI and ARC emulation.
		Bintrans updates (slightly faster 32-bit load/store on alpha).
		Updates to the i386 backend too, but no real speed improvement.
20041230	Cleaning up parts of the 64-bit virtual-to-physical code for
		R10000, and per-machine default TLB entries can now be set
		for SGI and ARC machines.
		Fix: SGI-IP27 is ARC64, not ARCS.
20050101	Minor updates.
20050102	Minor updates.
		Fixing a 32-bit 'addu' bug in the bintrans backends.
		Allowing fast load/stores even in 64-bit bintrans mode, if
		the top 32 bits are either 0x00000000 or 0xffffffff (for Alpha
		only).
		Re-enabling ctc0/cfc0 (but what do they do?).
		Adding beql, bnel, blezl, and bgtzl to the Alpha backend.
20050103	Adding fast 32-bit load/store for 64-bit mode emulation to
		the i386 backend too (similar to the Alpha code). Not really
		tested yet, though.
		Adding an incomplete regression test case for lwl/lwr/ldl/ldr.
		Playing around with bintranslated lwl and lwr for Alpha.
20050104	Changing many occurrences of pica to jazz.
		Various other updates.
20050105	Fixing some more bintrans bugs (both Alpha and i386).
		Unaligned stores that cause tlb refill exceptions should now
		cause TLBS exceptions, not TLBL.
		Adding experimental swl and swr to the Alpha backend.
		Adding lwl, lwr, swl, and swr to the i386 backend.
20050106	Adding another hpcmips model (Casio E-105, -F2), and doing
		some updates to the VR41xx code. NetBSD/hpcmips prints some
		boot messages.
20050108	Minor updates.
20050109	dev_dec5500_ioboard.c and dev_sgec.c => dev_kn220.c.
		dev_crime.c, _mace.c, and _macepci.c => dev_sgi_ip32.c.
		Also adding dev_sgi_mec, _ust, and _mte into dev_sgi_ip32.c.
		A slight license change. Still revised BSD-style, though.
		memory_v2p.c is now included separately for MMU10K and
		MMU8K.
		Fixing a NS16550 bug, triggered by NetBSD 2.0, but not 1.6.2.
		Refreshing the UltraSPARC bintrans backend skeleton.
		Merging dev_decbi, _deccca, and _decxmi into dev_dec5800.c.
		Sparc backend instructions done so far: mthi/mtlo/mfhi/mflo,
		lui, addu, daddu, subu, dsubu, and, or, nor, xor, sll, dsll,
		srl, and sra.
		Adding more sparc backend instructions: addiu, daddiu, xori,
		ori, andi, srlv, srav, sllv, slt, sltu, slti, sltiu.
20050110	Changing the default bintrans cache to 16 MB, and some other
		minor updates.
		Adding div and divu to the i386 backend (but not Alpha yet).
		More work on ARCBIOS emulation.
		Trying to find a bug which affects Linux on Playstation 2 in
		bintrans mode.
20050111	Moving around some Playstation 2 stuff, but I haven't found
		the bug yet. It is triggered by load/stores.
		More ARCBIOS updates, enough to let Windows NT partition
		disks in some rudimentary fashion.
20050112	Testing for release 0.2.4.
		Fixes to suppress compiler warnings.

==============  RELEASE 0.2.4  ==============

20050113	Minor updates.
20050114	Fix to the Alpha bintrans backend to allow compilation with
		old versions of gcc (2.95.4).

==============  RELEASE 0.2.4.1  ==============

20050115	Various updates and fixes: some IP32 stuff, the debugger,
		ns16550 loopback tx isn't transmitted out anymore, ...
		Removing old/broken R10000 cache hacks, which weren't really
		used.
20050116	Minor updates to the documentation on using PROM images.
		Adding ARCBIOS function 0x100 (used by IRIX when returning
		from main, but undocumented).
		MC146818 updates (mostly SGI-related).
		ARCS64 updates (testing with an OpenBSD snapshot in IP27
		mode). This causes Linux/IP30 to not work. Maybe IP27 and
		IP30 differ, even though both are 64-bit?
		Removing some nonsensical ARCS64 code from machine.c.
		Better handling of 128MB and 512MB memory offsets used by
		various SGI models.
		Trying to revert the ARCS64 changes (OpenBSD/sgi does
		seem to be aware of 64-bit vs 32-bit data structures in
		_some_ places, but not all), to make Linux/IP30 work again.
		Adding "power off" capability to the RTC, as used on IP32
		(and possibly IP30 and others).
		Some IP30 updates.
20050117	Debugger updates (symbolic register names instead of just rX,
		and using %08x instead of %016llx when emulating 32-bit CPUs
		in more places than before).
		Removing the dummy sgi_nasid and sgi_cpuinfo devices.
		Also using symbolic names for coprocessor 0 registers.
		Adding DEV_MP_MEMORY to dev_mp.c.
		Adding a 'put' command to the debugger.
		ARCBIOS function 0x100 used by IRIX seems to _NOT_ be a
		ReturnFromMain(), but something else undocumented.
		The count and compare registers are now 32-bit in all
		places, as they should be. (This causes, among other things,
		OpenBSD/sgi to not hang randomly in userspace anymore.)
		On breakpoints, the debugger is now entered _at_ the
		instruction at the breakpoint, not after it.
		Some cursor keys now work when input via X.
		Refreshing the MC146818 device a bit more.
20050118	Trying to add some support for less-than-4KB virtual pages,
		used by at least VR4131. Thanks to Alexander Yurchenko for
		noticing this. (I'm assuming for now that all R41xx work
		this way, which is not necessarily true.) It doesn't really
		work yet though.
		Making the "loading files" messages and other things
		displayed during startup look nicer.
		Changing the disassembly output of ori, xori, and andi to
		unsigned hex immediate, instead of decimal (as suggested
		by Alec Voropay).
		configure-script update for HP-UX, and switching from using
		inet_aton() to inet_pton() (as suggested by Nils Weller).
		Also adding -lnsl on Solaris, if required by inet_pton().
		Lots of minor R4100-related updates.
20050119	Correcting the R4100 config register in src/coproc.c, and
		a minor update to dev_vr41xx.
		Finally began a redesign/remodelling/cleanup that I have had
		in mind for quite some time... moving many things that were
		in struct emul into a new struct machine.
		Userland emulation now works with bintrans.
		Refreshing the LANCE controller (dev_le.c).
		Fixing the LK201 keyboard id.
20050120	Continuing on the remodelling/cleanup.
		Fixing the SCSI bug (which was triggered sometimes by
		NetBSD 2.0/pmax on Linux/i386 hosts).
		Adding a speed-limit hack to the mc146818 device when running
		in DECstation mode (limiting to emulated 30 MHz clock, so
		that Ultrix doesn't freak out).
		Adding an ugly workaround for the floating-point bug which
		is triggered when running NetBSD/pmax 2.0 on an Alpha host.
		The count/compare interrupt will not be triggered now, if
		the compare register is left untouched.
		Many, many other fixes...
20050121	Continuing the remodelling/cleanup. (Mostly working on the
		network stack, and on moving towards multiple emulations
		with multiple machines per emulation.)
		Bugfix: not clearing lowest parts of lo0 and hi on tlbr
		(seems to increase performance when emulating Linux?).
20050122	Continuing the remodelling/cleanup.
		Linux on DECstation uses an unused part of the RTC registers
		for the year value; this is supported now, so Linux thinks
		it is 2005 and not 2000.
		Began hacking on something to reply to Debian's DHCP requests,
		but it's not working yet.
20050123	Continuing the remodelling/cleanup.
20050124	Continuing the remodelling/cleanup.
		Converting the dev_vga charcell memory to support direct
		bintrans access (similar to how dev_fb works), and fixing a
		couple of bintrans bugs in the process.
		The emulator now compiles under OpenBSD/arc 2.3 without
		crashing (mostly due to the bintrans fixes, but also some
		minor updates to the configure script).
20050125	Continuing the remodelling/cleanup.
		The '-a' option was missing in the Hello World example in the
		documentation. (Thanks to Soohyun Cho for noticing this.)
20050126	Continuing the remodelling/cleanup. Moving around stuff in
		the header files, etc. Adding a '-K' command line option, which
		forces the debugger to be entered at the exit of a simulation,
		regardless of failure or success. Beginning to work on the
		config file parser.
		Splitting doc/index.html into experiments.html, guestoses.html,
		intro.html, and misc.html.
		Updating the man page and adding a skeleton section about the
		configure files to doc/misc.html.
20050127	Minor documentation updates.
20050128	Continuing the remodelling/cleanup, mostly working on the
		config file parser (adding a couple of machine words, enough
		to run simple emulations, and adding support for multi-line
		comments using tuborgs).
		Removing some command line options for the least working
		emulation modes (-e, -f, -g, -E, -H), adding new -E and -e
		options for selecting machine type.
		Moving global variables from src/x11.c into struct machine (a
		bit buggy, but it seems to almost work).
20050129	Removing the Playstation 2 mode (-B) and hpcmips modes (-F)
		from the command line as well.
		Changing the -T command line option from meaning "trace on bad
		address" to meaning "enter the single-step debugger on bad
		address".
		More updates to the configuration file parser (nested tuborg
		comments, more options, ...).
		Making -s a global setting, not just affecting one machine.
		Trying to fix the X11 event stuff... but it's so ugly that it
		must be rewritten later.
		Continuing the multi-emul cleanup.
		Bugfixes and other updates to dev_vga.
20050130	Continuing the remodelling/cleanup. Finally moving out the
		MIPS-dependent stuff of the cpu struct into its own struct.
		Renaming cpu.c to cpu_mips.c, and cpu_common.c to cpu.c.
		Adding a dummy cpu_ppc.c.
		Removing the UltraSPARC bintrans backend.
		Many other minor updates.
		src/file.c should now be free from MIPS dependencies.
20050131	Continuing a little bit more on src/file.c. PPC ELFs can now
		be loaded, it seems.
		Continuing on src/cpu_ppc.c.
		'mips' is undefined by the configure script, if it is defined
		by default. (Fixes build on at least OpenBSD/arc and
		NetBSD/arc, where gcc defines 'mips'.)
		A couple of other minor fixes.
		Removing the "Changing framebuffer resolution" section from
		doc/misc.h (because it's buggy and not very useful anyway).
		Adding a mystrtoull(), used on systems where there is no
		strtoull() in libc.
		Adding 'add_x11_display' to the configure file parser 
		(corresponding to the -z command line option).
		Continuing the multi-emul machine cleanup.
20050201	Minor updates (man page, RELEASE, README).
		Continuing the cleanup.
		Adding a 'name' field to the emul struct, and adding a command
		to the debugger ("focus") to make it possible to switch focus
		to different machines (in different emuls).
		Beginning to work on the PPC disassembler etc. Hello World
		for linux-ppc64 can be disassembled :-)
20050202	Adding a hack for reading symbols from Microsoft's variant of
		COFF files.
		Adding a dummy cpu_sparc.c and include/cpu_sparc.h.
		Cleaning up more to support multiple cpu families.
		Various other minor updates.
		Fixing another old-gcc-on-Alpha problem.
20050203	Bintrans cache size is now variable, settable by a new
		configuration file option 'bintrans_size'.
		The debugger can now theoretically call disassembler functions
		for cpu families with non-fixed instruction word length.
		Working more on the mec controller. It now works well enough
		to let both NetBSD/sgimips and OpenBSD/sgi connect to the
		outside world using ftp :-)
		Continuing on the cleanup of the networking subsystem.
20050204	Continuing the cleanup.
		Working on a way to use separate xterms for serial ports and
		other console input, when emulating multiple machines (or one
		machine with multiple serial lines active).
20050205	Minor documentation updates.
20050206	Moving console.c from devices/ to src/, and continuing the
		work on using separate windows for each serial console.
		Trying to get OpenBSD/sgi to boot with root-on-nfs on an
		emulated NetBSD/pmax server, but no success in setting up
		the server yet.
20050207	Continuing on the console cleanup.
		Adding a 'start_paused' configuration file option, and a
		'pause' command to the debugger.
20050208	Everything now builds with --withoutmips.
		Continuing on the documentation on how to run OpenBSD/sgi, but
		no actual success yet.
		sizeof => (int)sizeof in the configure script (as suggested by
		Nils Weller).
20050209	Adding a check for -lm to the configure script.
		Continuing on the cleanup: trying to make memory_rw
		independent of MIPS.
		Trying to make a better fix for the cdrom-block-size problems
		on FreeBSD. (It now works with a Windows NT 4.0 cdrom in my
		drive.)
		Began a clean-up of the userland subsystem.
20050210	Continuing the userland cleanup.
		IBM's Hello World example for Linux/PPC64 runs fine now.
20050211	Continuing the cleanup. Removing the --userland configure
		option (because support for userland is always included now).
		Working more on getting OpenBSD/sgi to boot with root on
		nfs. (Booting with the ramdisk kernel, and mounting root via
		nfs works, but not yet from the generic kernel.)
		Major update to the manpage.
		Removing the -G command line option (SGI modes).
20050212	Updating the documentation (experimental devices: dev_cons
		and dev_mp, better hello.c, and some other things).
20050213	Some minor fixes: documentation, 80 columns in some source
		files, better configure script options.
		Adding some more PPC instructions.
		Added a NOFPU flag to the MIPS cpu flags, so that executing
		FPU instructions on for example VR4xxx will fail (as suggested
		by Alexander Yurchenko).
20050214	Implementing more PPC instructions.
		Adding dev_pmppc.
20050215	Continuing the work on PPC emulation. Adding a (mostly
		non-working) NetBSD/powerpc userland mode, and a (buggy)
		show_trace_tree thing (similar to the MIPS version).
20050216	Continuing...
20050218	Continuing the clean-up. (Merging the devices and devstate
		debugger commands, more 80-column cleanup, some documentation
		updates, ...).
20050219	Removing the -D, -A, and -a command line options. Updating the
		documentation, in preparation for the next release.
		Adding interrupt stuff to dev_cons.
		Single-stepping now looks/works better with bintrans enabled.
		Beginning the first phase of release testing; various minor
		updates to make everything build cleanly on Solaris.
20050220	Continuing testing for the release...
                
==============  RELEASE 0.3  ==============

20050221	Minor updates. Some more clean-up.
		Beginning on the new device registry stuff.
20050222	Continuing on the device stuff, and doing various other kinds
		of clean-up.
		Adding a dummy BeBox mode.
		Making the pc register common for all cpu families.
		Adding some more PPC instructions and fixing some bugs.
20050223	Continuing on the BeBox stuff, and adding more instructions.
		Adding an ns16550 to the VR4131 emulation (which is probably
		a close enough fake to the VR4131's SIU unit).
20050224	Minor updates. Adding dummy PReP, macppc, and DB64360 modes.
		Continuing on the device registry rewrite.
20050225	Continuing on the device stuff.
20050226	Continuing more on the device rewrite.
		Separating the "testmips" machine into testmips and baremips
		(and similarly with the ppc machine).
		Redesigning the device registry again :-)
		Adding a "device" command to the config file parser.
		Adding "device add" and "device remove" to the debugger.
		Removing pcidevs.h, because it was almost unused.
20050228	Correcting the Sprite disk image url in the documentation.
20050301	Adding an URISC cpu emulation mode (single-opcode machine).
20050303	Adding some files to the experiments directory (rssb_as.c,
		rssb_as.README, urisc_test.s).
		Continuing on the device stuff.
20050304	Minor documentation update. Also, the SPARC, PPC, and URISC
		modes are now enabled by default in the configure script.
		Some minor PPC updates (adding a VGA device to the bebox
		emulation mode).
20050305	Moving the static i386 bintrans runchunk code snippet (and the
		others) to be dynamically generated. (This allows the code to
		compile on i386 with old gcc.)
		Loading PPC64 ELFs now sets R2 to the TOC base.
		Changing the name of the emulator from mips64emul to GXemul.
		Splitting out the configuration file part of the documentation
		into its own file (configfiles.html).
20050306	Some really minor documentation updates.
		Adding a -D command line option (for "fully deterministic"
		behaviour).
20050308	Minor PPC updates. Adding a dummy OpenFirmware emulation layer.
20050309	Adding a hack for systems without inet_pton (such as Cygwin in
		Windows) as suggested by Soohyun Cho. (And updating the
		configure script too.)
		Adding a dummy HPPA cpu family.
		Some more OpenFirmware updates.
		Faster loading of badly aligned ELF regions.
20050311	Minor updates. Adding a dummy "NEC MobilePro 780" hpcmips
		machine mode; disabling direct bintrans access to framebuffers
		that are not 4K page aligned.
20050312	Adding an ugly KIU hack to the VR41xx device (which enables
		NetBSD/hpcmips MobilePro 780 keyboard input).
20050313	Adding a dummy "pcic" device (a pcmcia card controller).
		Adding a dummy Alpha cpu emulation mode.
		Fixing a strcmp length bug (thanks to Alexander Yurchenko for
		noticing the bug).
20050314	Some minor bintrans-related updates in preparation for a new
		bintrans subsystem: command line option -b now means "old
		bintrans", -B means "disable bintrans", and using no option at
		all selects "new bintrans".
		Better generation of MAC addresses when emulating multiple
		machines and/or NICs.
		Minor documentation updates (regarding configuration files).
20050315	Adding dummy standby, suspend, and hibernate MIPS opcodes.
		RTC interrupt hack for VR4121 (hpcmips).
		Enough of the pcic is now emulated to let NetBSD/hpcmips detect
		a PCMCIA harddisk controller card (but there is no support for
		ISA/PCMCIA interrupts yet).
		Adding preliminary instructions on how to install
		NetBSD/hpcmips.
		Continuing the attempt to get harddisks working with interrupts
		(pcic, wdc on hpcmips).
20050318	Minor updates. (Fixing disassembly of MIPS bgtz etc., 
		continuing on the device cleanup, ...)
20050319	Minor updates.
20050320	Minor updates.
20050322	Various minor updates.
20050323	Some more minor updates.
20050328	VR41xx-related updates (keyboard stuff: the space key and
		shifted and ctrled keys are now working in userland, i.e.
		NetBSD/hpcmips' ramdisk installer).
		Also adding simple cursor key support to the VR41xx kiu.
20050329	Some progress on the wdc.
		Updating the documentation of how to (possibly) install
		NetBSD/hpcmips, once it is working.
		Adding delays before wdc interrupts; this allows NetBSD
		2.0/hpcmips to be successfully installed!
		Mirroring physical addresses 0x8....... to 0x00000000 on
		hpcmips; this makes it possible to run X11 inside
		NetBSD/hpcmips :-)
		Updating the documentation regarding NetBSD/hpcmips.
		Fixing 16-bit vs 15-bit color in dev_fb.
20050330	Print a warning when the user attempts to load a gzipped
		file. (Thanks to Juan RP for making me aware of this "bug".)
20050331	Importing aic7xxx_reg.h from NetBSD.
		Adding a "-x" command line option, which forces xterms for
		each emulated serial port to always be opened.
		Adding a MobilePro 770 mode (same as 780, but different
		framebuffer address which allows bintrans = fast scrolling),
		and a MobilePro 800 (with 800x600 pixels framebuffer :-).
20050401	Minor updates.
20050402	Minor updates. (The standby and suspend instructions are
		bintransed as NOPs, and some minor documentation updates.)
20050403	Adding an Agenda VR3 mode, and playing around with a Linux
		kernel image, but not much success yet.
		Changing BIFB_D16_FFFF -> BIFB_D16_0000 for the hpcmips 
		framebuffers, causing NetBSD to boot with correct colors.
		New syntax for loading raw files: loadaddr:skiplen:
		initialpc:filename. (This is necessary to boot the Linux VR3
		kernels.)
		The Linux VR3 kernel boots in both serial console mode and
		using the framebuffer, but it panics relatively early.
20050404	Continuing on the AHC, and some other minor updates.
20050405	Adding a note in doc/experimental.html about "root1.2.6.cramfs"
		(thanks to Alec Voropay for noticing that it wasn't part
		of root1.2.6.kernel-8.00).
		Also adding a note about another cramfs image.
		-o options are now added to the command line passed to the
		Linux kernel, when emulating the VR3.
		Adding a MobilePro 880 mode, and a dummy IBM WorkPad Z50 mode.
20050406	Connecting the VR3 serial controller to irq 9 (Linux calls this
		irq 17), and some other interrupt-related cleanups.
		Reducing the memory overhead per bintranslated page. (Hopefully
		this makes things faster, or at least not slower...)
20050407	Some more cleanup regarding command line argument passing for
		the hpcmips modes.
		Playing with Linux kernels for MobilePro 770 and 800; they get
		as far as mounting a root filesystem, but then crash.
		Doing some testing for the next release.

==============  RELEASE 0.3.1  ==============


/*
 *  Copyright (C) 2004-2005 Anders Gavare. All rights reserved.
 *
 *  Redistribution and use in source and binary forms, with or without
 *  modification, are permitted provided that the following conditions are met:
 *
 *  1. Redistributions of source code must retain the above copyright
 *     notice, this list of conditions and the following disclaimer.
 *  2. Redistributions in binary form must reproduce the above copyright
 *     notice, this list of conditions and the following disclaimer in the
 *     documentation and/or other materials provided with the distribution.
 *  3. The name of the author may not be used to endorse or promote products
 *     derived from this software without specific prior written permission.
 *
 *  THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 *  ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 *  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 *  ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 *  FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 *  DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 *  OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 *  HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 *  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 *  OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 *  SUCH DAMAGE.
 *
 *
 *  $Id: bintrans_i386.c,v 1.75 2005/03/22 09:12:04 debug Exp $
 *
 *  i386 specific code for dynamic binary translation.
 *  See bintrans.c for more information. Included from bintrans.c.
 *
 *  Translated code uses the following conventions at all times:
 *
 *	esi	points to the cpu struct
 *	edi	lowest 32 bits of cpu->pc
 *	ebp	contains cpu->bintrans_instructions_executed
 */


struct cpu dummy_cpu;
struct mips_coproc dummy_coproc;
struct vth32_table dummy_vth32_table;


/*
 *  bintrans_host_cacheinvalidate()
 *
 *  Invalidate the host's instruction cache. On i386, this isn't necessary,
 *  so this is an empty function.
 */
static void bintrans_host_cacheinvalidate(unsigned char *p, size_t len)
{
	/* Do nothing. */
}


/* offsetof (in stddef.h) could possibly be used, but I'm not sure
   if it will take care of the compiler problems... */

#define ofs_i		(((size_t)&dummy_cpu.cd.mips.bintrans_instructions_executed) - ((size_t)&dummy_cpu))
#define ofs_pc		(((size_t)&dummy_cpu.pc) - ((size_t)&dummy_cpu))
#define ofs_pc_last	(((size_t)&dummy_cpu.cd.mips.pc_last) - ((size_t)&dummy_cpu))
#define ofs_tabl0	(((size_t)&dummy_cpu.cd.mips.vaddr_to_hostaddr_table0) - ((size_t)&dummy_cpu))
#define ofs_chunks	((size_t)&dummy_vth32_table.bintrans_chunks[0] - (size_t)&dummy_vth32_table)
#define ofs_chunkbase	((size_t)&dummy_cpu.cd.mips.chunk_base_address - (size_t)&dummy_cpu)
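
The ofs_* macros above obtain a field's byte offset by subtracting the address of a dummy struct instance from the address of one of its members. The sketch below compares that idiom with offsetof from <stddef.h>. The struct demo_cpu and every demo_* name are hypothetical stand-ins, since the real struct cpu layout is not shown in this file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpu; the real layout is not shown here. */
struct demo_cpu {
	int		running;
	uint64_t	pc;
	struct {
		int64_t	instructions_executed;
	} cd;
};

static struct demo_cpu demo_dummy;

/* The same address-subtraction idiom as the ofs_* macros. */
#define demo_ofs_pc	(((size_t)&demo_dummy.pc) - ((size_t)&demo_dummy))
#define demo_ofs_i	(((size_t)&demo_dummy.cd.instructions_executed) - \
			    ((size_t)&demo_dummy))

/* Returns 1 if both idioms yield the same byte offsets for this struct. */
static int demo_offsets_agree(void)
{
	/* An offset can never point past the end of the struct. */
	assert(demo_ofs_pc < sizeof(struct demo_cpu));
	assert(demo_ofs_i < sizeof(struct demo_cpu));

	return demo_ofs_pc == offsetof(struct demo_cpu, pc) &&
	    demo_ofs_i == offsetof(struct demo_cpu, cd.instructions_executed);
}
```

Whether offsetof would also avoid the "compiler problems" mentioned in the comment cannot be determined from this file alone, so this is only a comparison of the two idioms, not a suggested replacement.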


static void (*bintrans_runchunk)(struct cpu *, unsigned char *);
static void (*bintrans_jump_to_32bit_pc)(struct cpu *);
static void (*bintrans_loadstore_32bit)(struct cpu *);


/*
 *  bintrans_write_quickjump():
 *
 *  Writes a 5-byte "jmp rel32" at quickjump_code, targeting chunkoffset
 *  within mem->translation_code_chunk_space.
 */
static void bintrans_write_quickjump(struct memory *mem,
	unsigned char *quickjump_code, uint32_t chunkoffset)
{
	uint32_t i386_addr;
	unsigned char *a = quickjump_code;

	i386_addr = chunkoffset + (size_t)mem->translation_code_chunk_space;
	i386_addr = i386_addr - ((size_t)a + 5);

	/* printf("chunkoffset=%i, %08x %08x %i\n",
	    chunkoffset, i386_addr, a, ofs); */

	*a++ = 0xe9;
	*a++ = i386_addr;
	*a++ = i386_addr >> 8;
	*a++ = i386_addr >> 16;
	*a++ = i386_addr >> 24;
}
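
bintrans_write_quickjump() emits opcode 0xe9 followed by a 32-bit displacement, which it computes as the target address minus the address of the jump plus 5, because the displacement of "jmp rel32" counts from the end of the 5-byte instruction. A minimal sketch of that arithmetic follows. It uses plain 32-bit integers in place of host pointers, and the demo_* helpers are hypothetical, not part of GXemul.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Writes a 5-byte "jmp rel32" into buf. from_addr is the (hypothetical)
 * address at which the jump will execute; to_addr is its target.
 */
static void demo_emit_jmp_rel32(unsigned char *buf, uint32_t from_addr,
	uint32_t to_addr)
{
	/* The displacement counts from the end of the 5-byte instruction. */
	uint32_t rel = to_addr - (from_addr + 5);

	buf[0] = 0xe9;
	buf[1] = rel & 0xff;		/* little-endian, low byte first */
	buf[2] = (rel >> 8) & 0xff;
	buf[3] = (rel >> 16) & 0xff;
	buf[4] = (rel >> 24) & 0xff;
}

/* Decodes the target of a jump written by demo_emit_jmp_rel32(). */
static uint32_t demo_jmp_target(const unsigned char *buf, uint32_t from_addr)
{
	uint32_t rel;

	assert(buf[0] == 0xe9);
	rel = (uint32_t)buf[1] | ((uint32_t)buf[2] << 8) |
	    ((uint32_t)buf[3] << 16) | ((uint32_t)buf[4] << 24);
	return from_addr + 5 + rel;
}
```

Because the subtraction wraps modulo 2^32, the same code handles backward jumps: a jump from 0x2000 to 0x1000 is stored as the displacement 0xffffeffb.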


/*
 *  bintrans_write_chunkreturn():
 */
static void bintrans_write_chunkreturn(unsigned char **addrp)
{
	unsigned char *a = *addrp;
	*a++ = 0xc3;		/* ret */
	*addrp = a;
}


/*
 *  bintrans_write_chunkreturn_fail():
 */
static void bintrans_write_chunkreturn_fail(unsigned char **addrp)
{
	unsigned char *a = *addrp;

	/* 81 cd 00 00 00 01    orl $0x1000000,%ebp */
	*a++ = 0x81; *a++ = 0xcd;
	*a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x01;	/* TODO: not hardcoded */

	*a++ = 0xc3;		/* ret */
	*addrp = a;
}

123
124 /*
125 * bintrans_write_pc_inc(): Emits code that adds 4 to %edi (the low 32 bits of the pc) and increments %ebp.
126 */
127 static void bintrans_write_pc_inc(unsigned char **addrp)
128 {
129 unsigned char *a = *addrp;
130
131 /* 83 c7 04 add $0x4,%edi */
132 *a++ = 0x83; *a++ = 0xc7; *a++ = 4;
133
134 #if 0
135 if (!bintrans_32bit_only) {
136 int ofs;
137 /* 83 96 zz zz zz zz 00 adcl $0x0,zz(%esi) */
138 ofs = ((size_t)&dummy_cpu.pc) - (size_t)&dummy_cpu;
139 ofs += 4;
140 *a++ = 0x83; *a++ = 0x96;
141 *a++ = ofs & 255;
142 *a++ = (ofs >> 8) & 255;
143 *a++ = (ofs >> 16) & 255;
144 *a++ = (ofs >> 24) & 255;
145 *a++ = 0;
146 }
147 #endif
148
149 /* 45 inc %ebp */
150 *a++ = 0x45;
151
152 *addrp = a;
153 }
154
155
156 /*
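157 * load_pc_into_eax_edx(): Emits code that loads the pc into edx:eax; the low word comes from %edi, the high word from cpu->pc in memory.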
158 */
159 static void load_pc_into_eax_edx(unsigned char **addrp)
160 {
161 unsigned char *a;
162 a = *addrp;
163
164 /* 89 f8 mov %edi,%eax */
165 *a++ = 0x89; *a++ = 0xf8;
166
167 #if 0
168 if (bintrans_32bit_only) {
169 /* 99 cltd */
170 *a++ = 0x99;
171 } else
172 #endif
173 {
174 int ofs = ((size_t)&dummy_cpu.pc) - (size_t)&dummy_cpu;
175 /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
176 ofs += 4;
177 *a++ = 0x8b; *a++ = 0x96;
178 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
179 }
180
181 *addrp = a;
182 }
183
184
185 /*
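186 * store_eax_edx_into_pc(): Emits code that stores edx:eax as the pc; the low word goes to %edi, the high word to cpu->pc in memory.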
187 */
188 static void store_eax_edx_into_pc(unsigned char **addrp)
189 {
190 unsigned char *a;
191 int ofs = ((size_t)&dummy_cpu.pc) - (size_t)&dummy_cpu;
192 a = *addrp;
193
194 /* 89 c7 mov %eax,%edi */
195 *a++ = 0x89; *a++ = 0xc7;
196
197 /* 89 96 3c 30 00 00 mov %edx,0x303c(%esi) */
198 ofs += 4;
199 *a++ = 0x89; *a++ = 0x96;
200 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
201
202 *addrp = a;
203 }
204
205
206 /*
207 * load_into_eax_edx():
208 *
209 * Usage: load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]); etc.
210 */
211 static void load_into_eax_edx(unsigned char **addrp, void *p)
212 {
213 unsigned char *a;
214 int ofs = (size_t)p - (size_t)&dummy_cpu;
215 a = *addrp;
216
217 /* 8b 86 38 30 00 00 mov 0x3038(%esi),%eax */
218 *a++ = 0x8b; *a++ = 0x86;
219 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
220
221 #if 0
222 if (bintrans_32bit_only) {
223 /* 99 cltd */
224 *a++ = 0x99;
225 } else
226 #endif
227 {
228 /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
229 ofs += 4;
230 *a++ = 0x8b; *a++ = 0x96;
231 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
232 }
233
234 *addrp = a;
235 }
236
237
238 /*
239 * load_into_eax_and_sign_extend_into_edx():
240 *
241 * Usage: load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]); etc.
242 */
243 static void load_into_eax_and_sign_extend_into_edx(unsigned char **addrp, void *p)
244 {
245 unsigned char *a;
246 int ofs = (size_t)p - (size_t)&dummy_cpu;
247 a = *addrp;
248
249 /* 8b 86 38 30 00 00 mov 0x3038(%esi),%eax */
250 *a++ = 0x8b; *a++ = 0x86;
251 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
252
253 /* 99 cltd */
254 *a++ = 0x99;
255
256 *addrp = a;
257 }
258
259
260 /*
261 * load_into_eax_dont_care_about_edx():
262 *
263 * Usage: load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]); etc.
264 */
265 static void load_into_eax_dont_care_about_edx(unsigned char **addrp, void *p)
266 {
267 unsigned char *a;
268 int ofs = (size_t)p - (size_t)&dummy_cpu;
269 a = *addrp;
270
271 /* 8b 86 38 30 00 00 mov 0x3038(%esi),%eax */
272 *a++ = 0x8b; *a++ = 0x86;
273 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
274
275 *addrp = a;
276 }
277
278
279 /*
280 * store_eax_edx():
281 *
282 * Usage: store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]); etc.
283 */
284 static void store_eax_edx(unsigned char **addrp, void *p)
285 {
286 unsigned char *a;
287 int ofs = (size_t)p - (size_t)&dummy_cpu;
288 a = *addrp;
289
290 /* 89 86 38 30 00 00 mov %eax,0x3038(%esi) */
291 *a++ = 0x89; *a++ = 0x86;
292 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
293
294 /* 89 96 3c 30 00 00 mov %edx,0x303c(%esi) */
295 ofs += 4;
296 *a++ = 0x89; *a++ = 0x96;
297 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
298
299 *addrp = a;
300 }
301
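/*
 * Illustrative sketch (not part of GXemul): the load/store helpers above
 * all emit a two-byte opcode/ModRM pair followed by a 32-bit little-endian
 * displacement relative to %esi. The demo_* helpers below are hypothetical
 * names that factor out that pattern; the expected byte sequences are the
 * ones quoted in the comments of store_eax_edx().
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append a 32-bit value in little-endian order; returns the new end. */
static unsigned char *demo_emit_le32(unsigned char *a, uint32_t v)
{
	assert(a != NULL);
	*a++ = v & 0xff;
	*a++ = (v >> 8) & 0xff;
	*a++ = (v >> 16) & 0xff;
	*a++ = (v >> 24) & 0xff;
	return a;
}

/*
 * Emit "mov %eax,ofs(%esi)" followed by "mov %edx,ofs+4(%esi)",
 * i.e. store the 64-bit pair edx:eax at byte offset ofs.
 */
static unsigned char *demo_emit_store_eax_edx(unsigned char *a, uint32_t ofs)
{
	*a++ = 0x89; *a++ = 0x86;       /* mov %eax,disp32(%esi) */
	a = demo_emit_le32(a, ofs);
	*a++ = 0x89; *a++ = 0x96;       /* mov %edx,disp32(%esi) */
	a = demo_emit_le32(a, ofs + 4);
	return a;
}
```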
302
303 /*
304 * bintrans_write_instruction__lui():
305 */
306 static int bintrans_write_instruction__lui(unsigned char **addrp, int rt, int imm)
307 {
308 unsigned char *a;
309
310 a = *addrp;
311 if (rt == 0)
312 goto rt0;
313
314 /* b8 00 00 dc fe mov $0xfedc0000,%eax */
315 *a++ = 0xb8; *a++ = 0; *a++ = 0;
316 *a++ = imm & 255; *a++ = imm >> 8;
317
318 /* 99 cltd */
319 *a++ = 0x99;
320
321 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
322 *addrp = a;
323
324 rt0:
325 bintrans_write_pc_inc(addrp);
326 return 1;
327 }
328
329
330 /*
331 * bintrans_write_instruction__jr():
332 */
333 static int bintrans_write_instruction__jr(unsigned char **addrp,
334 int rs, int rd, int special)
335 {
336 unsigned char *a;
337 int ofs;
338
339 a = *addrp;
340
341 /*
342 * Perform the jump by setting cpu->delay_slot = TO_BE_DELAYED
343 * and cpu->delay_jmpaddr = gpr[rs].
344 */
345
346 /* c7 86 38 30 00 00 01 00 00 00 movl $0x1,0x3038(%esi) */
347 ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
348 *a++ = 0xc7; *a++ = 0x86;
349 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
350 *a++ = TO_BE_DELAYED; *a++ = 0; *a++ = 0; *a++ = 0;
351
352 #if 0
353 if (bintrans_32bit_only)
354 load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
355 else
356 #endif
357 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
358
359 store_eax_edx(&a, &dummy_cpu.cd.mips.delay_jmpaddr);
360
361 if (special == SPECIAL_JALR && rd != 0) {
362 /* gpr[rd] = retaddr (pc + 8) */
363
364 #if 0
365 if (bintrans_32bit_only) {
366 load_pc_into_eax_edx(&a);
367 /* 83 c0 08 add $0x8,%eax */
368 *a++ = 0x83; *a++ = 0xc0; *a++ = 0x08;
369 } else
370 #endif
371 {
372 load_pc_into_eax_edx(&a);
373 /* 83 c0 08 add $0x8,%eax */
374 /* 83 d2 00 adc $0x0,%edx */
375 *a++ = 0x83; *a++ = 0xc0; *a++ = 0x08;
376 *a++ = 0x83; *a++ = 0xd2; *a++ = 0x00;
377 }
378
379 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
380 }
381
382 *addrp = a;
383 bintrans_write_pc_inc(addrp);
384 return 1;
385 }
386
387
388 /*
389 * bintrans_write_instruction__mfmthilo():
390 */
391 static int bintrans_write_instruction__mfmthilo(unsigned char **addrp,
392 int rd, int from_flag, int hi_flag)
393 {
394 unsigned char *a;
395
396 a = *addrp;
397
398 if (from_flag) {
399 if (rd != 0) {
400 /* mfhi or mflo */
401 if (hi_flag)
402 load_into_eax_edx(&a, &dummy_cpu.cd.mips.hi);
403 else
404 load_into_eax_edx(&a, &dummy_cpu.cd.mips.lo);
405 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
406 }
407 } else {
408 /* mthi or mtlo */
409 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
410 if (hi_flag)
411 store_eax_edx(&a, &dummy_cpu.cd.mips.hi);
412 else
413 store_eax_edx(&a, &dummy_cpu.cd.mips.lo);
414 }
415
416 *addrp = a;
417 bintrans_write_pc_inc(addrp);
418 return 1;
419 }
420
421
422 /*
423 * bintrans_write_instruction__addiu_etc():
424 */
425 static int bintrans_write_instruction__addiu_etc(unsigned char **addrp,
426 int rt, int rs, int imm, int instruction_type)
427 {
428 unsigned char *a;
429 unsigned int uimm;
430
431 /* TODO: overflow detection for ADDI and DADDI */
432 switch (instruction_type) {
433 case HI6_ADDI:
434 case HI6_DADDI:
435 return 0;
436 }
437
438 a = *addrp;
439
440 if (rt == 0)
441 goto rt0;
442
443 uimm = imm & 0xffff;
444
445 if (uimm == 0 && (instruction_type == HI6_ADDIU ||
446 instruction_type == HI6_ADDI)) {
447 load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
448 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
449 goto rt0;
450 }
451
452 if (uimm == 0 && (instruction_type == HI6_DADDIU ||
453 instruction_type == HI6_DADDI || instruction_type == HI6_ORI)) {
454 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
455 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
456 goto rt0;
457 }
458
459 #if 0
460 if (bintrans_32bit_only)
461 load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
462 else
463 #endif
464 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
465
466 switch (instruction_type) {
467 case HI6_ADDIU:
468 case HI6_DADDIU:
469 case HI6_ADDI:
470 case HI6_DADDI:
471 if (imm & 0x8000) {
472 /* 05 39 fd ff ff add $0xfffffd39,%eax */
473 /* 83 d2 ff adc $0xffffffff,%edx */
474 *a++ = 0x05; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0xff; *a++ = 0xff;
475 if (instruction_type == HI6_DADDIU) {
476 *a++ = 0x83; *a++ = 0xd2; *a++ = 0xff;
477 }
478 } else {
479 /* 05 c7 02 00 00 add $0x2c7,%eax */
480 /* 83 d2 00 adc $0x0,%edx */
481 *a++ = 0x05; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0; *a++ = 0;
482 if (instruction_type == HI6_DADDIU) {
483 *a++ = 0x83; *a++ = 0xd2; *a++ = 0;
484 }
485 }
486 if (instruction_type == HI6_ADDIU) {
487 /* 99 cltd */
488 *a++ = 0x99;
489 }
490 break;
491 case HI6_ANDI:
492 /* 25 34 12 00 00 and $0x1234,%eax */
493 /* 31 d2 xor %edx,%edx */
494 *a++ = 0x25; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0; *a++ = 0;
495 *a++ = 0x31; *a++ = 0xd2;
496 break;
497 case HI6_ORI:
498 /* 0d 34 12 00 00 or $0x1234,%eax */
499 *a++ = 0xd; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0; *a++ = 0;
500 break;
501 case HI6_XORI:
502 /* 35 34 12 00 00 xor $0x1234,%eax */
503 *a++ = 0x35; *a++ = uimm; *a++ = uimm >> 8; *a++ = 0; *a++ = 0;
504 break;
505 case HI6_SLTIU:
506 /* set if less than, unsigned. (compare edx:eax to ecx:ebx) */
507 /* ecx:ebx = the immediate value */
508 /* bb dc fe ff ff mov $0xfffffedc,%ebx */
509 /* b9 ff ff ff ff mov $0xffffffff,%ecx */
510 /* or */
511 /* 29 c9 sub %ecx,%ecx */
512 #if 0
513 if (bintrans_32bit_only) {
514 /* 99 cltd */
515 *a++ = 0x99;
516 }
517 #endif
518 *a++ = 0xbb; *a++ = uimm; *a++ = uimm >> 8;
519 if (uimm & 0x8000) {
520 *a++ = 0xff; *a++ = 0xff;
521 *a++ = 0xb9; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff;
522 } else {
523 *a++ = 0; *a++ = 0;
524 *a++ = 0x29; *a++ = 0xc9;
525 }
526
527 /* if edx <= ecx and eax < ebx then 1, else 0. (TODO: edx < ecx with eax >= ebx should also give 1.) */
528 /* 39 ca cmp %ecx,%edx */
529 /* 77 0b ja <ret0> */
530 /* 39 d8 cmp %ebx,%eax */
531 /* 73 07 jae <ret0> */
532 *a++ = 0x39; *a++ = 0xca;
533 *a++ = 0x77; *a++ = 0x0b;
534 *a++ = 0x39; *a++ = 0xd8;
535 *a++ = 0x73; *a++ = 0x07;
536
537 /* b8 01 00 00 00 mov $0x1,%eax */
538 /* eb 02 jmp <common> */
539 *a++ = 0xb8; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
540 *a++ = 0xeb; *a++ = 0x02;
541
542 /* ret0: */
543 /* 29 c0 sub %eax,%eax */
544 *a++ = 0x29; *a++ = 0xc0;
545
546 /* common: */
547 /* 99 cltd */
548 *a++ = 0x99;
549 break;
550 case HI6_SLTI:
551 /* set if less than, signed. (compare edx:eax to ecx:ebx) */
552 /* ecx:ebx = the immediate value */
553 /* bb dc fe ff ff mov $0xfffffedc,%ebx */
554 /* b9 ff ff ff ff mov $0xffffffff,%ecx */
555 /* or */
556 /* 29 c9 sub %ecx,%ecx */
557 #if 0
558 if (bintrans_32bit_only) {
559 /* 99 cltd */
560 *a++ = 0x99;
561 }
562 #endif
563 *a++ = 0xbb; *a++ = uimm; *a++ = uimm >> 8;
564 if (uimm & 0x8000) {
565 *a++ = 0xff; *a++ = 0xff;
566 *a++ = 0xb9; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff;
567 } else {
568 *a++ = 0; *a++ = 0;
569 *a++ = 0x29; *a++ = 0xc9;
570 }
571
572 /* if edx > ecx then 0. */
573 /* if edx < ecx then 1. */
574 /* if eax < ebx then 1, else 0. (TODO: this low-word compare should be unsigned, jb instead of jl.) */
575 /* 39 ca cmp %ecx,%edx */
576 /* 7c 0a jl <ret1> */
577 /* 7f 04 jg <ret0> */
578 /* 39 d8 cmp %ebx,%eax */
579 /* 7c 04 jl <ret1> */
580 *a++ = 0x39; *a++ = 0xca;
581 *a++ = 0x7c; *a++ = 0x0a;
582 *a++ = 0x7f; *a++ = 0x04;
583 *a++ = 0x39; *a++ = 0xd8;
584 *a++ = 0x7c; *a++ = 0x04;
585
586 /* ret0: */
587 /* 29 c0 sub %eax,%eax */
588 /* eb 05 jmp <common> */
589 *a++ = 0x29; *a++ = 0xc0;
590 *a++ = 0xeb; *a++ = 0x05;
591
592 /* ret1: */
593 /* b8 01 00 00 00 mov $0x1,%eax */
594 *a++ = 0xb8; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
595
596 /* common: */
597 /* 99 cltd */
598 *a++ = 0x99;
599 break;
600 }
601
602 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
603
604 rt0:
605 *addrp = a;
606 bintrans_write_pc_inc(addrp);
607 return 1;
608 }
609
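/*
 * Illustrative sketch (not part of GXemul): SLTIU and SLTI above compare
 * the 64-bit pair edx:eax against ecx:ebx one 32-bit half at a time.
 * As a plain C reference for what such a split comparison is meant to
 * compute, the hypothetical demo_* functions below take the high and low
 * halves separately: the high halves decide the result unless they are
 * equal, in which case the low halves are compared as unsigned values.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 64-bit "a < b" on (hi, lo) halves. */
static int demo_sltu64(uint32_t hi_a, uint32_t lo_a, uint32_t hi_b, uint32_t lo_b)
{
	if (hi_a != hi_b)
		return hi_a < hi_b;
	return lo_a < lo_b;
}

/* Signed 64-bit "a < b" on (hi, lo) halves. */
static int demo_slt64(uint32_t hi_a, uint32_t lo_a, uint32_t hi_b, uint32_t lo_b)
{
	if (hi_a != hi_b) {
		/* Flipping the sign bit turns a signed compare into an
		   unsigned one, without relying on signed casts. */
		return (hi_a ^ 0x80000000u) < (hi_b ^ 0x80000000u);
	}
	/* Only the high half carries the sign; low halves are unsigned. */
	return lo_a < lo_b;
}
```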
610
611 /*
612 * bintrans_write_instruction__jal():
613 */
614 static int bintrans_write_instruction__jal(unsigned char **addrp,
615 int imm, int link)
616 {
617 unsigned char *a;
618 uint32_t subimm;
619 int ofs;
620
621 a = *addrp;
622
623 load_pc_into_eax_edx(&a);
624
625 if (link) {
626 /* gpr[31] = pc + 8 */
627 #if 0
628 if (bintrans_32bit_only) {
629 /* 50 push %eax */
630 /* 83 c0 08 add $0x8,%eax */
631 *a++ = 0x50;
632 *a++ = 0x83; *a++ = 0xc0; *a++ = 0x08;
633 } else
634 #endif
635 {
636 /* 50 push %eax */
637 /* 52 push %edx */
638 /* 83 c0 08 add $0x8,%eax */
639 /* 83 d2 00 adc $0x0,%edx */
640 *a++ = 0x50;
641 *a++ = 0x52;
642 *a++ = 0x83; *a++ = 0xc0; *a++ = 0x08;
643 *a++ = 0x83; *a++ = 0xd2; *a++ = 0x00;
644 }
645 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[31]);
646 #if 0
647 if (bintrans_32bit_only) {
648 /* 58 pop %eax */
649 *a++ = 0x58;
650 } else
651 #endif
652 {
653 /* 5a pop %edx */
654 /* 58 pop %eax */
655 *a++ = 0x5a;
656 *a++ = 0x58;
657 }
658 }
659
660 /* delay_jmpaddr = top 36 bits of pc together with lowest 28 bits of imm*4: */
661 imm *= 4;
662
663 /* Add 4, because the jump is from the delay slot: */
664 /* 83 c0 04 add $0x4,%eax */
665 /* 83 d2 00 adc $0x0,%edx */
666 *a++ = 0x83; *a++ = 0xc0; *a++ = 0x04;
667 *a++ = 0x83; *a++ = 0xd2; *a++ = 0x00;
668
669 /* c1 e8 1c shr $0x1c,%eax */
670 /* c1 e0 1c shl $0x1c,%eax */
671 *a++ = 0xc1; *a++ = 0xe8; *a++ = 0x1c;
672 *a++ = 0xc1; *a++ = 0xe0; *a++ = 0x1c;
673
674 subimm = imm;
675 subimm &= 0x0fffffff;
676
677 /* 0d 78 56 34 12 or $0x12345678,%eax */
678 *a++ = 0x0d; *a++ = subimm; *a++ = subimm >> 8;
679 *a++ = subimm >> 16; *a++ = subimm >> 24;
680
681 store_eax_edx(&a, &dummy_cpu.cd.mips.delay_jmpaddr);
682
683 /* c7 86 38 30 00 00 01 00 00 00 movl $0x1,0x3038(%esi) */
684 ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
685 *a++ = 0xc7; *a++ = 0x86;
686 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
687 *a++ = TO_BE_DELAYED; *a++ = 0; *a++ = 0; *a++ = 0;
688
689 *addrp = a;
690 bintrans_write_pc_inc(addrp);
691 return 1;
692 }
693
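/*
 * Illustrative sketch (not part of GXemul): the value that the code
 * emitted by bintrans_write_instruction__jal() stores in delay_jmpaddr
 * can be written in plain C as follows. The top 36 bits come from the
 * address of the delay slot (pc + 4), the low 28 bits from imm * 4, and
 * the link address is pc + 8. The demo_* names are hypothetical.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Jump target of J/JAL located at pc, with instruction field imm. */
static uint64_t demo_jal_target(uint64_t pc, uint32_t imm)
{
	uint64_t delay_slot_pc = pc + 4;
	uint64_t low28 = ((uint64_t)imm << 2) & 0x0fffffffULL;

	return (delay_slot_pc & ~0x0fffffffULL) | low28;
}

/* Return address written to gpr[31] by JAL: skips the delay slot. */
static uint64_t demo_jal_link(uint64_t pc)
{
	return pc + 8;
}
```

/*
 * The second test value in a quick check is worth choosing near a
 * 256 MB boundary: a jump at 0x0ffffffc has its delay slot at
 * 0x10000000, so the target lands in the upper region.
 */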
694
695 /*
696 * bintrans_write_instruction__addu_etc():
697 */
698 static int bintrans_write_instruction__addu_etc(unsigned char **addrp,
699 int rd, int rs, int rt, int sa, int instruction_type)
700 {
701 unsigned char *a;
702 int load64 = 0, do_store = 1;
703
704 /* TODO: Not yet implemented: mult/div with rd != 0, 64-bit shifts, movz/movn. */
705 switch (instruction_type) {
706 case SPECIAL_MULT:
707 case SPECIAL_MULTU:
708 case SPECIAL_DIV:
709 case SPECIAL_DIVU:
710 if (rd != 0)
711 return 0;
712 break;
713 case SPECIAL_DSLL:
714 case SPECIAL_DSLL32:
715 case SPECIAL_DSRA:
716 case SPECIAL_DSRA32:
717 case SPECIAL_DSRL:
718 case SPECIAL_DSRL32:
719 case SPECIAL_MOVZ:
720 case SPECIAL_MOVN:
721 bintrans_write_chunkreturn_fail(addrp);
722 return 0;
723 }
724
725 switch (instruction_type) {
726 case SPECIAL_DADDU:
727 case SPECIAL_DSUBU:
728 case SPECIAL_OR:
729 case SPECIAL_AND:
730 case SPECIAL_NOR:
731 case SPECIAL_XOR:
732 case SPECIAL_DSLL:
733 case SPECIAL_DSRL:
734 case SPECIAL_DSRA:
735 case SPECIAL_DSLL32:
736 case SPECIAL_DSRL32:
737 case SPECIAL_DSRA32:
738 case SPECIAL_SLT:
739 case SPECIAL_SLTU:
740 load64 = 1;
741 }
742
743 switch (instruction_type) {
744 case SPECIAL_MULT:
745 case SPECIAL_MULTU:
746 case SPECIAL_DIV:
747 case SPECIAL_DIVU:
748 break;
749 default:
750 if (rd == 0)
751 goto rd0;
752 }
753
754 a = *addrp;
755
756 if ((instruction_type == SPECIAL_ADDU || instruction_type == SPECIAL_DADDU
757 || instruction_type == SPECIAL_OR) && rt == 0) {
758 if (load64)
759 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
760 else
761 load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
762 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
763 *addrp = a;
764 goto rd0;
765 }
766
767 /* edx:eax = rs, ecx:ebx = rt */
768 if (load64) {
769 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
770 /* 89 c3 mov %eax,%ebx */
771 /* 89 d1 mov %edx,%ecx */
772 *a++ = 0x89; *a++ = 0xc3; *a++ = 0x89; *a++ = 0xd1;
773 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
774 } else {
775 load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
776 /* 89 c3 mov %eax,%ebx */
777 /* 89 d1 mov %edx,%ecx */
778 *a++ = 0x89; *a++ = 0xc3; *a++ = 0x89; *a++ = 0xd1;
779 load_into_eax_and_sign_extend_into_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
780 }
781
782 switch (instruction_type) {
783 case SPECIAL_ADDU:
784 /* 01 d8 add %ebx,%eax */
785 /* 99 cltd */
786 *a++ = 0x01; *a++ = 0xd8;
787 *a++ = 0x99;
788 break;
789 case SPECIAL_DADDU:
790 /* 01 d8 add %ebx,%eax */
791 /* 11 ca adc %ecx,%edx */
792 *a++ = 0x01; *a++ = 0xd8;
793 *a++ = 0x11; *a++ = 0xca;
794 break;
795 case SPECIAL_SUBU:
796 /* 29 d8 sub %ebx,%eax */
797 /* 99 cltd */
798 *a++ = 0x29; *a++ = 0xd8;
799 *a++ = 0x99;
800 break;
801 case SPECIAL_DSUBU:
802 /* 29 d8 sub %ebx,%eax */
803 /* 19 ca sbb %ecx,%edx */
804 *a++ = 0x29; *a++ = 0xd8;
805 *a++ = 0x19; *a++ = 0xca;
806 break;
807 case SPECIAL_AND:
808 /* 21 d8 and %ebx,%eax */
809 /* 21 ca and %ecx,%edx */
810 *a++ = 0x21; *a++ = 0xd8;
811 *a++ = 0x21; *a++ = 0xca;
812 break;
813 case SPECIAL_OR:
814 /* 09 d8 or %ebx,%eax */
815 /* 09 ca or %ecx,%edx */
816 *a++ = 0x09; *a++ = 0xd8;
817 *a++ = 0x09; *a++ = 0xca;
818 break;
819 case SPECIAL_NOR:
820 /* 09 d8 or %ebx,%eax */
821 /* 09 ca or %ecx,%edx */
822 /* f7 d0 not %eax */
823 /* f7 d2 not %edx */
824 *a++ = 0x09; *a++ = 0xd8;
825 *a++ = 0x09; *a++ = 0xca;
826 *a++ = 0xf7; *a++ = 0xd0;
827 *a++ = 0xf7; *a++ = 0xd2;
828 break;
829 case SPECIAL_XOR:
830 /* 31 d8 xor %ebx,%eax */
831 /* 31 ca xor %ecx,%edx */
832 *a++ = 0x31; *a++ = 0xd8;
833 *a++ = 0x31; *a++ = 0xca;
834 break;
835 case SPECIAL_SLL:
836 /* 89 d8 mov %ebx,%eax */
837 /* c1 e0 1f shl $0x1f,%eax */
838 /* 99 cltd */
839 *a++ = 0x89; *a++ = 0xd8;
840 if (sa == 1) {
841 *a++ = 0xd1; *a++ = 0xe0;
842 } else {
843 *a++ = 0xc1; *a++ = 0xe0; *a++ = sa;
844 }
845 *a++ = 0x99;
846 break;
847 case SPECIAL_SRA:
848 /* 89 d8 mov %ebx,%eax */
849 /* c1 f8 1f sar $0x1f,%eax */
850 /* 99 cltd */
851 *a++ = 0x89; *a++ = 0xd8;
852 if (sa == 1) {
853 *a++ = 0xd1; *a++ = 0xf8;
854 } else {
855 *a++ = 0xc1; *a++ = 0xf8; *a++ = sa;
856 }
857 *a++ = 0x99;
858 break;
859 case SPECIAL_SRL:
860 /* 89 d8 mov %ebx,%eax */
861 /* c1 e8 1f shr $0x1f,%eax */
862 /* 99 cltd */
863 *a++ = 0x89; *a++ = 0xd8;
864 if (sa == 1) {
865 *a++ = 0xd1; *a++ = 0xe8;
866 } else {
867 *a++ = 0xc1; *a++ = 0xe8; *a++ = sa;
868 }
869 *a++ = 0x99;
870 break;
871 case SPECIAL_SLTU:
872 /* set if less than, unsigned. (compare edx:eax to ecx:ebx) */
873 /* if edx <= ecx and eax < ebx then 1, else 0. (TODO: edx < ecx with eax >= ebx should also give 1.) */
874 /* 39 ca cmp %ecx,%edx */
875 /* 77 0b ja <ret0> */
876 /* 39 d8 cmp %ebx,%eax */
877 /* 73 07 jae <ret0> */
878 *a++ = 0x39; *a++ = 0xca;
879 *a++ = 0x77; *a++ = 0x0b;
880 *a++ = 0x39; *a++ = 0xd8;
881 *a++ = 0x73; *a++ = 0x07;
882
883 /* b8 01 00 00 00 mov $0x1,%eax */
884 /* eb 02 jmp <common> */
885 *a++ = 0xb8; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
886 *a++ = 0xeb; *a++ = 0x02;
887
888 /* ret0: */
889 /* 29 c0 sub %eax,%eax */
890 *a++ = 0x29; *a++ = 0xc0;
891
892 /* common: */
893 /* 99 cltd */
894 *a++ = 0x99;
895 break;
896 case SPECIAL_SLT:
897 /* set if less than, signed. (compare edx:eax to ecx:ebx) */
898 /* if edx > ecx then 0. */
899 /* if edx < ecx then 1. */
900 /* if eax < ebx then 1, else 0. (TODO: this low-word compare should be unsigned, jb instead of jl.) */
901 /* 39 ca cmp %ecx,%edx */
902 /* 7c 0a jl <ret1> */
903 /* 7f 04 jg <ret0> */
904 /* 39 d8 cmp %ebx,%eax */
905 /* 7c 04 jl <ret1> */
906 *a++ = 0x39; *a++ = 0xca;
907 *a++ = 0x7c; *a++ = 0x0a;
908 *a++ = 0x7f; *a++ = 0x04;
909 *a++ = 0x39; *a++ = 0xd8;
910 *a++ = 0x7c; *a++ = 0x04;
911
912 /* ret0: */
913 /* 29 c0 sub %eax,%eax */
914 /* eb 05 jmp <common> */
915 *a++ = 0x29; *a++ = 0xc0;
916 *a++ = 0xeb; *a++ = 0x05;
917
918 /* ret1: */
919 /* b8 01 00 00 00 mov $0x1,%eax */
920 *a++ = 0xb8; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
921
922 /* common: */
923 /* 99 cltd */
924 *a++ = 0x99;
925 break;
926 case SPECIAL_SLLV:
927 /* rd = rt << (rs&31) (logical) eax = ebx << (eax&31) */
928 /* xchg ebx,eax, then we can do eax = eax << (ebx&31) */
929 /* 93 xchg %eax,%ebx */
930 /* 89 d9 mov %ebx,%ecx */
931 /* 83 e1 1f and $0x1f,%ecx */
932 /* d3 e0 shl %cl,%eax */
933 *a++ = 0x93;
934 *a++ = 0x89; *a++ = 0xd9;
935 *a++ = 0x83; *a++ = 0xe1; *a++ = 0x1f;
936 *a++ = 0xd3; *a++ = 0xe0;
937 /* 99 cltd */
938 *a++ = 0x99;
939 break;
940 case SPECIAL_SRLV:
941 /* rd = rt >> (rs&31) (logical) eax = ebx >> (eax&31) */
942 /* xchg ebx,eax, then we can do eax = eax >> (ebx&31) */
943 /* 93 xchg %eax,%ebx */
944 /* 89 d9 mov %ebx,%ecx */
945 /* 83 e1 1f and $0x1f,%ecx */
946 /* d3 e8 shr %cl,%eax */
947 *a++ = 0x93;
948 *a++ = 0x89; *a++ = 0xd9;
949 *a++ = 0x83; *a++ = 0xe1; *a++ = 0x1f;
950 *a++ = 0xd3; *a++ = 0xe8;
951 /* 99 cltd */
952 *a++ = 0x99;
953 break;
954 case SPECIAL_SRAV:
955 /* rd = rt >> (rs&31) (arithmetic) eax = ebx >> (eax&31) */
956 /* xchg ebx,eax, then we can do eax = eax >> (ebx&31) */
957 /* 93 xchg %eax,%ebx */
958 /* 89 d9 mov %ebx,%ecx */
959 /* 83 e1 1f and $0x1f,%ecx */
960 /* d3 f8 sar %cl,%eax */
961 *a++ = 0x93;
962 *a++ = 0x89; *a++ = 0xd9;
963 *a++ = 0x83; *a++ = 0xe1; *a++ = 0x1f;
964 *a++ = 0xd3; *a++ = 0xf8;
965 /* 99 cltd */
966 *a++ = 0x99;
967 break;
968 case SPECIAL_MULT:
969 case SPECIAL_MULTU:
970 /* 57 push %edi */
971 *a++ = 0x57;
972 if (instruction_type == SPECIAL_MULT) {
973 /* f7 eb imul %ebx */
974 *a++ = 0xf7; *a++ = 0xeb;
975 } else {
976 /* f7 e3 mul %ebx */
977 *a++ = 0xf7; *a++ = 0xe3;
978 }
979 /* here: edx:eax = hi:lo */
980 /* 89 d7 mov %edx,%edi */
981 /* 99 cltd */
982 *a++ = 0x89; *a++ = 0xd7;
983 *a++ = 0x99;
984 /* here: edi=hi, edx:eax = sign-extended lo */
985 store_eax_edx(&a, &dummy_cpu.cd.mips.lo);
986 /* 89 f8 mov %edi,%eax */
987 /* 99 cltd */
988 *a++ = 0x89; *a++ = 0xf8;
989 *a++ = 0x99;
990 /* here: edx:eax = sign-extended hi */
991 store_eax_edx(&a, &dummy_cpu.cd.mips.hi);
992 /* 5f pop %edi */
993 *a++ = 0x5f;
994 do_store = 0;
995 break;
996 case SPECIAL_DIV:
997 case SPECIAL_DIVU:
998 /*
999 * In: edx:eax = rs, ecx:ebx = rt
1000 * Out: LO = rs / rt, HI = rs % rt
1001 */
1002 /* Division by zero on MIPS is undefined, but on
1003 i386 it causes an exception, so we'll try to
1004 avoid that. */
1005 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00; /* cmp $0x0,%ebx */
1006 *a++ = 0x75; *a++ = 0x01; /* jne skip_inc */
1007 *a++ = 0x43; /* inc %ebx */
1008
1009 /* 57 push %edi */
1010 *a++ = 0x57;
1011 if (instruction_type == SPECIAL_DIV) {
1012 *a++ = 0x99; /* cltd */
1013 *a++ = 0xf7; *a++ = 0xfb; /* idiv %ebx */
1014 } else {
1015 *a++ = 0x29; *a++ = 0xd2; /* sub %edx,%edx */
1016 *a++ = 0xf7; *a++ = 0xf3; /* div %ebx */
1017 }
1018 /* here: edx:eax = hi:lo */
1019 /* 89 d7 mov %edx,%edi */
1020 /* 99 cltd */
1021 *a++ = 0x89; *a++ = 0xd7;
1022 *a++ = 0x99;
1023 /* here: edi=hi, edx:eax = sign-extended lo */
1024 store_eax_edx(&a, &dummy_cpu.cd.mips.lo);
1025 /* 89 f8 mov %edi,%eax */
1026 /* 99 cltd */
1027 *a++ = 0x89; *a++ = 0xf8;
1028 *a++ = 0x99;
1029 /* here: edx:eax = sign-extended hi */
1030 store_eax_edx(&a, &dummy_cpu.cd.mips.hi);
1031 /* 5f pop %edi */
1032 *a++ = 0x5f;
1033 do_store = 0;
1034 break;
1035 #if 0
1036 /* TODO: These are from bintrans_alpha.c. Translate them to i386. */
1037
1038 case SPECIAL_DSLL:
1039 *a++ = 0x21; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sll t1,sa,t0 */
1040 break;
1041 case SPECIAL_DSLL32:
1042 sa += 32;
1043 *a++ = 0x21; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sll t1,sa,t0 */
1044 break;
1045 case SPECIAL_DSRA:
1046 *a++ = 0x81; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sra t1,sa,t0 */
1047 break;
1048 case SPECIAL_DSRA32:
1049 sa += 32;
1050 *a++ = 0x81; *a++ = 0x17 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48; /* sra t1,sa,t0 */
1051 break;
1052 case SPECIAL_DSRL:
1053 /* Note: bits of sa are distributed among two different bytes. */
1054 *a++ = 0x81; *a++ = 0x16 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48;
1055 break;
1056 case SPECIAL_DSRL32:
1057 /* Note: bits of sa are distributed among two different bytes. */
1058 sa += 32;
1059 *a++ = 0x81; *a++ = 0x16 + ((sa & 7) << 5); *a++ = 0x40 + (sa >> 3); *a++ = 0x48;
1060 break;
1061 #endif
1062 }
1063
1064 if (do_store)
1065 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rd]);
1066
1067 *addrp = a;
1068 rd0:
1069 bintrans_write_pc_inc(addrp);
1070 return 1;
1071 }
1072
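/*
 * Illustrative sketch (not part of GXemul): the MULT/MULTU case above
 * multiplies two 32-bit values, then stores the low and the high 32 bits
 * of the product in LO and HI, each sign-extended to 64 bits (the two
 * cltd instructions). A plain C model of that result, using hypothetical
 * demo_* names, could look like this.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sign-extend a 32-bit value to 64 bits, as cltd does for eax. */
static uint64_t demo_sext32(uint32_t v)
{
	return (v & 0x80000000u) ? (0xffffffff00000000ULL | v) : (uint64_t)v;
}

/* Signed 32x32 multiply (MULT). */
static void demo_mult(int32_t rs, int32_t rt, uint64_t *hi, uint64_t *lo)
{
	uint64_t p = (uint64_t)((int64_t)rs * (int64_t)rt);

	assert(hi != NULL && lo != NULL);
	*lo = demo_sext32((uint32_t)p);
	*hi = demo_sext32((uint32_t)(p >> 32));
}

/* Unsigned 32x32 multiply (MULTU). */
static void demo_multu(uint32_t rs, uint32_t rt, uint64_t *hi, uint64_t *lo)
{
	uint64_t p = (uint64_t)rs * (uint64_t)rt;

	assert(hi != NULL && lo != NULL);
	*lo = demo_sext32((uint32_t)p);
	*hi = demo_sext32((uint32_t)(p >> 32));
}
```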
1073
1074 /*
1075 * bintrans_write_instruction__mfc_mtc():
1076 */
1077 static int bintrans_write_instruction__mfc_mtc(struct memory *mem,
1078 unsigned char **addrp, int coproc_nr, int flag64bit, int rt,
1079 int rd, int mtcflag)
1080 {
1081 unsigned char *a, *failskip;
1082 int ofs;
1083
1084 if (mtcflag && flag64bit) {
1085 /* dmtc */
1086 return 0;
1087 }
1088
1089 /*
1090 * NOTE: Only a few registers are readable without side effects.
1091 */
1092 if (rt == 0 && !mtcflag)
1093 return 0;
1094
1095 if (coproc_nr >= 1)
1096 return 0;
1097
1098 if (rd == COP0_RANDOM || rd == COP0_COUNT)
1099 return 0;
1100
1101 a = *addrp;
1102
1103 /*************************************************************
1104 *
1105 * TODO: Check for kernel mode, or Coproc X usability bit!
1106 *
1107 *************************************************************/
1108
1109 /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
1110 ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
1111 *a++ = 0x8b; *a++ = 0x96;
1112 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1113
1114 /* here, edx = cpu->coproc[0] */
1115
1116 if (mtcflag) {
1117 /* mtc */
1118
1119 /* TODO: This code only works for mtc0, not dmtc0 */
1120
1121 /* 8b 9a 38 30 00 00 mov 0x3038(%edx),%ebx */
1122 ofs = ((size_t)&dummy_coproc.reg[rd]) - (size_t)&dummy_coproc;
1123 *a++ = 0x8b; *a++ = 0x9a;
1124 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1125
1126 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
1127
1128 /*
1129 * Here: eax contains the value in register rt,
1130 * ebx contains the coproc register rd value.
1131 *
1132 * In the general case, only allow mtc if it does not
1133 * change the coprocessor register!
1134 */
1135
1136 switch (rd) {
1137
1138 case COP0_INDEX:
1139 break;
1140
1141 case COP0_ENTRYLO0:
1142 case COP0_ENTRYLO1:
1143 /* TODO: Not all bits are writable! */
1144 break;
1145
1146 case COP0_EPC:
1147 break;
1148
1149 case COP0_STATUS:
1150 /* Only allow updates to the status register if
1151 the interrupt enable bits were changed, but no
1152 other bits! */
1153 /* 89 c1 mov %eax,%ecx */
1154 /* 89 da mov %ebx,%edx */
1155 /* 81 e1 00 00 e7 0f and $0x0fe70000,%ecx */
1156 /* 81 e2 00 00 e7 0f and $0x0fe70000,%edx */
1157 /* 39 ca cmp %ecx,%edx */
1158 /* 74 01 je <ok> */
1159 *a++ = 0x89; *a++ = 0xc1;
1160 *a++ = 0x89; *a++ = 0xda;
1161 *a++ = 0x81; *a++ = 0xe1; *a++ = 0x00; *a++ = 0x00;
1162 if (mem->bintrans_32bit_only) {
1163 *a++ = 0xe7; *a++ = 0x0f;
1164 } else {
1165 *a++ = 0xff; *a++ = 0xff;
1166 }
1167 *a++ = 0x81; *a++ = 0xe2; *a++ = 0x00; *a++ = 0x00;
1168 if (mem->bintrans_32bit_only) {
1169 *a++ = 0xe7; *a++ = 0x0f;
1170 } else {
1171 *a++ = 0xff; *a++ = 0xff;
1172 }
1173 *a++ = 0x39; *a++ = 0xca;
1174 *a++ = 0x74; failskip = a; *a++ = 0x00;
1175 bintrans_write_chunkreturn_fail(&a);
1176 *failskip = (size_t)a - (size_t)failskip - 1;
1177
1178 /* Only allow the update if it would NOT cause
1179 an interrupt exception: */
1180
1181 /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
1182 ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
1183 *a++ = 0x8b; *a++ = 0x96;
1184 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1185
1186 /* 8b 9a 38 30 00 00 mov 0x3038(%edx),%ebx */
1187 ofs = ((size_t)&dummy_coproc.reg[COP0_CAUSE]) - (size_t)&dummy_coproc;
1188 *a++ = 0x8b; *a++ = 0x9a;
1189 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1190
1191 /* 21 c3 and %eax,%ebx */
1192 /* 81 e3 00 ff 00 00 and $0xff00,%ebx */
1193 /* 83 fb 00 cmp $0x0,%ebx */
1194 /* 74 01 je <ok> */
1195 *a++ = 0x21; *a++ = 0xc3;
1196 *a++ = 0x81; *a++ = 0xe3; *a++ = 0x00;
1197 *a++ = 0xff; *a++ = 0x00; *a++ = 0x00;
1198 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
1199 *a++ = 0x74; failskip = a; *a++ = 0x00;
1200 bintrans_write_chunkreturn_fail(&a);
1201 *failskip = (size_t)a - (size_t)failskip - 1;
1202
1203 break;
1204
1205 default:
1206 /* 39 d8 cmp %ebx,%eax */
1207 /* 74 01 je <ok> */
1208 *a++ = 0x39; *a++ = 0xd8;
1209 *a++ = 0x74; failskip = a; *a++ = 0x00;
1210 bintrans_write_chunkreturn_fail(&a);
1211 *failskip = (size_t)a - (size_t)failskip - 1;
1212 }
1213
1214 /* 8b 96 3c 30 00 00 mov 0x303c(%esi),%edx */
1215 ofs = ((size_t)&dummy_cpu.cd.mips.coproc[0]) - (size_t)&dummy_cpu;
1216 *a++ = 0x8b; *a++ = 0x96;
1217 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1218
1219 /* 8d 9a 38 30 00 00 lea 0x3038(%edx),%ebx */
1220 ofs = ((size_t)&dummy_coproc.reg[rd]) - (size_t)&dummy_coproc;
1221 *a++ = 0x8d; *a++ = 0x9a;
1222 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1223
1224 /* Sign-extend eax into edx:eax, and store it in
1225 coprocessor register rd: */
1226 /* 99 cltd */
1227 *a++ = 0x99;
1228
1229 /* 89 03 mov %eax,(%ebx) */
1230 /* 89 53 04 mov %edx,0x4(%ebx) */
1231 *a++ = 0x89; *a++ = 0x03;
1232 *a++ = 0x89; *a++ = 0x53; *a++ = 0x04;
1233 } else {
1234 /* mfc */
1235
1236 /* 8b 82 38 30 00 00 mov 0x3038(%edx),%eax */
1237 ofs = ((size_t)&dummy_coproc.reg[rd]) - (size_t)&dummy_coproc;
1238 *a++ = 0x8b; *a++ = 0x82;
1239 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1240
1241 if (flag64bit) {
1242 /* Load high 32 bits: (note: edx gets overwritten) */
1243 /* 8b 92 3c 30 00 00 mov 0x303c(%edx),%edx */
1244 ofs += 4;
1245 *a++ = 0x8b; *a++ = 0x92;
1246 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1247 } else {
1248 /* 99 cltd */
1249 *a++ = 0x99;
1250 }
1251
1252 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
1253 }
1254
1255 *addrp = a;
1256 bintrans_write_pc_inc(addrp);
1257 return 1;
1258 }
1259
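/*
 * Illustrative sketch (not part of GXemul): the "failskip" pattern used
 * in bintrans_write_instruction__mfc_mtc() emits a short conditional
 * jump with a zero placeholder displacement, emits the code to be
 * skipped, and then patches the placeholder with the distance from the
 * byte after the displacement to the current position. The hypothetical
 * helper demo_emit_je_over() below isolates that back-patching step.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Emit "je rel8" followed by len bytes of body, with rel8 patched so
 * that the jump lands just past the body. Returns the new end pointer.
 */
static unsigned char *demo_emit_je_over(unsigned char *a,
	const unsigned char *body, size_t len)
{
	unsigned char *failskip;

	assert(a != NULL && body != NULL);
	assert(len <= 127);             /* rel8 is a signed byte */

	*a++ = 0x74;                    /* je rel8 */
	failskip = a;
	*a++ = 0x00;                    /* placeholder displacement */

	memcpy(a, body, len);           /* code that the jump skips */
	a += len;

	/* rel8 counts from the byte after the displacement byte. */
	*failskip = (unsigned char)(a - failskip - 1);
	return a;
}
```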
1260
1261 /*
1262 * bintrans_write_instruction__branch():
1263 */
1264 static int bintrans_write_instruction__branch(unsigned char **addrp,
1265 int instruction_type, int regimm_type, int rt, int rs, int imm)
1266 {
1267 unsigned char *a;
1268 unsigned char *skip1 = NULL, *skip2 = NULL;
1269 int ofs, likely = 0;
1270
1271 switch (instruction_type) {
1272 case HI6_BEQL:
1273 case HI6_BNEL:
1274 case HI6_BLEZL:
1275 case HI6_BGTZL:
1276 likely = 1;
1277 }
1278
1279 /* TODO: See the Alpha backend on how these could be implemented: */
1280 if (likely)
1281 return 0;
1282
1283 a = *addrp;
1284
1285 /*
1286 * edx:eax = gpr[rs]; ecx:ebx = gpr[rt];
1287 *
1288 * Compare for equality (BEQ).
1289 * If the result was zero, then it means equality; perform the
1290 * delayed jump. Otherwise: skip.
1291 */
1292
1293 switch (instruction_type) {
1294 case HI6_BEQ:
1295 case HI6_BNE:
1296 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
1297 /* 89 c3 mov %eax,%ebx */
1298 /* 89 d1 mov %edx,%ecx */
1299 *a++ = 0x89; *a++ = 0xc3; *a++ = 0x89; *a++ = 0xd1;
1300 }
1301 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
1302
1303 if (instruction_type == HI6_BEQ && rt != rs) {
1304 /* If rt != rs, then skip. */
1305 /* 39 c3 cmp %eax,%ebx */
1306 /* 75 05 jne <skip> */
1307 /* 39 d1 cmp %edx,%ecx */
1308 /* 75 01 jne <skip> */
1309 *a++ = 0x39; *a++ = 0xc3;
1310 *a++ = 0x75; skip1 = a; *a++ = 0x00;
1311 #if 0
1312 if (!bintrans_32bit_only)
1313 #endif
1314 {
1315 *a++ = 0x39; *a++ = 0xd1;
1316 *a++ = 0x75; skip2 = a; *a++ = 0x00;
1317 }
1318 }
1319
1320 if (instruction_type == HI6_BNE) {
1321 /* If rt != rs, then ok. Otherwise skip. */
1322 #if 0
1323 if (bintrans_32bit_only) {
1324 /* 39 c3 cmp %eax,%ebx */
1325 /* 74 xx je <skip> */
1326 *a++ = 0x39; *a++ = 0xc3;
1327 *a++ = 0x74; skip2 = a; *a++ = 0x00;
1328 } else
1329 #endif
1330 {
1331 /* 39 c3 cmp %eax,%ebx */
1332 /* 75 06 jne 156 <bra> */
1333 /* 39 d1 cmp %edx,%ecx */
1334 /* 75 02 jne 156 <bra> */
1335 /* eb 01 jmp 157 <skip> */
1336 *a++ = 0x39; *a++ = 0xc3;
1337 *a++ = 0x75; *a++ = 0x06;
1338 *a++ = 0x39; *a++ = 0xd1;
1339 *a++ = 0x75; *a++ = 0x02;
1340 *a++ = 0xeb; skip2 = a; *a++ = 0x00;
1341 }
1342 }
1343
1344 if (instruction_type == HI6_BLEZ) {
1345 /* If both eax and edx are zero, then do the branch. */
1346 /* 83 f8 00 cmp $0x0,%eax */
1347 /* 75 07 jne <nott> */
1348 /* 83 fa 00 cmp $0x0,%edx */
1349 /* 75 02 jne 23d <nott> */
1350 /* eb 01 jmp <branch> */
1351 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1352 *a++ = 0x75; *a++ = 0x07;
1353 *a++ = 0x83; *a++ = 0xfa; *a++ = 0x00;
1354 *a++ = 0x75; *a++ = 0x02;
1355 *a++ = 0xeb; skip1 = a; *a++ = 0x00;
1356
1357 /* If high bit of edx is set, then rs < 0. */
1358 /* f7 c2 00 00 00 80 test $0x80000000,%edx */
1359 /* 74 00 jz skip */
1360 *a++ = 0xf7; *a++ = 0xc2; *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x80;
1361 *a++ = 0x74; skip2 = a; *a++ = 0x00;
1362
1363 if (skip1 != NULL)
1364 *skip1 = (size_t)a - (size_t)skip1 - 1;
1365 skip1 = NULL;
1366 }
1367 if (instruction_type == HI6_BGTZ) {
1368 /* If both eax and edx are zero, then skip the branch. */
1369 /* 83 f8 00 cmp $0x0,%eax */
1370 /* 75 07 jne <nott> */
1371 /* 83 fa 00 cmp $0x0,%edx */
1372 /* 75 02 jne 23d <nott> */
1373 /* eb 01 jmp <skip> */
1374 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1375 *a++ = 0x75; *a++ = 0x07;
1376 *a++ = 0x83; *a++ = 0xfa; *a++ = 0x00;
1377 *a++ = 0x75; *a++ = 0x02;
1378 *a++ = 0xeb; skip1 = a; *a++ = 0x00;
1379
1380 /* If high bit of edx is set, then rs < 0. */
1381 /* f7 c2 00 00 00 80 test $0x80000000,%edx */
1382 /* 75 00 jnz skip */
1383 *a++ = 0xf7; *a++ = 0xc2; *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x80;
1384 *a++ = 0x75; skip2 = a; *a++ = 0x00;
1385 }
1386 if (instruction_type == HI6_REGIMM && regimm_type == REGIMM_BLTZ) {
1387 /* If high bit of edx is set, then rs < 0. */
1388 /* f7 c2 00 00 00 80 test $0x80000000,%edx */
1389 /* 74 00 jz skip */
1390 *a++ = 0xf7; *a++ = 0xc2; *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x80;
1391 *a++ = 0x74; skip2 = a; *a++ = 0x00;
1392 }
1393 if (instruction_type == HI6_REGIMM && regimm_type == REGIMM_BGEZ) {
1394 /* If high bit of edx is not set, then rs >= 0. */
1395 /* f7 c2 00 00 00 80 test $0x80000000,%edx */
1396 /* 75 00 jnz skip */
1397 *a++ = 0xf7; *a++ = 0xc2; *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0x80;
1398 *a++ = 0x75; skip2 = a; *a++ = 0x00;
1399 }
1400
1401 /*
1402 * Perform the jump by setting cpu->delay_slot = TO_BE_DELAYED
1403 * and cpu->delay_jmpaddr = pc + 4 + (imm << 2).
1404 */
1405
1406 /* c7 86 38 30 00 00 01 00 00 00 movl $0x1,0x3038(%esi) */
1407 ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
1408 *a++ = 0xc7; *a++ = 0x86;
1409 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1410 *a++ = TO_BE_DELAYED; *a++ = 0; *a++ = 0; *a++ = 0;
1411
1412 load_pc_into_eax_edx(&a);
1413
1414 /* 05 78 56 34 12 add $0x12345678,%eax */
1415 /* 83 d2 00 adc $0x0,%edx */
1416 /* or */
1417 /* 83 d2 ff adc $0xffffffff,%edx */
1418 imm = (imm << 2) + 4;
1419 *a++ = 0x05; *a++ = imm; *a++ = imm >> 8; *a++ = imm >> 16; *a++ = imm >> 24;
1420 if (imm >= 0) {
1421 *a++ = 0x83; *a++ = 0xd2; *a++ = 0x00;
1422 } else {
1423 *a++ = 0x83; *a++ = 0xd2; *a++ = 0xff;
1424 }
1425 store_eax_edx(&a, &dummy_cpu.cd.mips.delay_jmpaddr);
1426
1427 if (skip1 != NULL)
1428 *skip1 = (size_t)a - (size_t)skip1 - 1;
1429 if (skip2 != NULL)
1430 *skip2 = (size_t)a - (size_t)skip2 - 1;
1431
1432 *addrp = a;
1433 bintrans_write_pc_inc(addrp);
1434 return 1;
1435 }
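
/*
 *  The skip1/skip2 handling above follows a common pattern: a short
 *  conditional jump is emitted with a placeholder displacement, and
 *  the displacement byte is patched once the jump target is known.
 *  The following is only a minimal standalone sketch of that pattern;
 *  the helper names (emit_jne_placeholder, patch_rel8, example_skip)
 *  are hypothetical and are not part of this file.
 */

```c
#include <stddef.h>

/*
 *  Hypothetical helper: emit "jne rel8" (opcode 0x75) with a zero
 *  placeholder, and return a pointer to the displacement byte.
 */
static unsigned char *emit_jne_placeholder(unsigned char **ap)
{
	unsigned char *a = *ap, *disp;

	*a++ = 0x75;
	disp = a;
	*a++ = 0x00;
	*ap = a;
	return disp;
}

/*
 *  Patch a rel8 displacement so that the jump lands at 'target'.
 *  The displacement is relative to the byte following it, hence
 *  the "- 1" (same arithmetic as *skip1 = a - skip1 - 1 above).
 */
static void patch_rel8(unsigned char *disp, unsigned char *target)
{
	*disp = (unsigned char)((size_t)target - (size_t)disp - 1);
}

/*  Example: a jne which skips a 3-byte cmp. Returns bytes emitted.  */
static size_t example_skip(unsigned char *buf)
{
	unsigned char *a = buf, *skip;

	skip = emit_jne_placeholder(&a);
	*a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;	/*  cmp $0x0,%eax  */
	patch_rel8(skip, a);
	return (size_t)(a - buf);
}
```

/*
 *  Note that a rel8 displacement only reaches +127 bytes forward, so
 *  this pattern assumes that the skipped code stays that short.
 */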
1436
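/*
 *  The "add imm32,%eax" followed by "adc $0,%edx" or "adc $-1,%edx"
 *  sequence in the branch code above performs a 64-bit addition of a
 *  sign-extended 32-bit displacement using two 32-bit halves. Below is
 *  a small reference model in plain C, as a sketch only: the function
 *  names are hypothetical, and imm is assumed to already be the
 *  sign-extended 16-bit branch offset.
 */

```c
#include <stdint.h>

/*  Add a sign-extended 32-bit displacement to a pc held as hi:lo.  */
static void add_disp_to_pc(uint32_t *lo, uint32_t *hi, int32_t disp)
{
	uint32_t old_lo = *lo;
	uint32_t carry;

	*lo = old_lo + (uint32_t)disp;		/*  add $disp,%eax  */
	carry = (*lo < old_lo);			/*  the carry flag  */

	/*  adc $0x0,%edx   or   adc $0xffffffff,%edx  */
	*hi = *hi + (disp < 0 ? 0xffffffffu : 0u) + carry;
}

/*  Branch target = pc + 4 + imm * 4, computed via the two halves.  */
static uint64_t branch_target(uint64_t pc, int imm)
{
	uint32_t lo = (uint32_t)pc, hi = (uint32_t)(pc >> 32);

	add_disp_to_pc(&lo, &hi, (int32_t)(imm * 4 + 4));
	return ((uint64_t)hi << 32) | lo;
}
```

/*
 *  The upper half is adjusted by 0 or -1 depending on the sign of the
 *  displacement, which is why the translator chooses between the two
 *  adc encodings at translation time.
 */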
1437
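/*
 *  bintrans_write_instruction__delayedbranch():
 *
 *  Emits i386 code which completes a delayed branch: if delay_slot is
 *  set, it is cleared and pc is set to delay_jmpaddr. The generated code
 *  then tries to jump directly to an already translated chunk, and
 *  otherwise returns to the main loop.
 */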
1441 static int bintrans_write_instruction__delayedbranch(struct memory *mem,
1442 unsigned char **addrp, uint32_t *potential_chunk_p, uint32_t *chunks,
1443 int only_care_about_chunk_p, int p, int forward)
1444 {
1445 unsigned char *a, *skip=NULL, *failskip;
1446 int ofs;
1447 uint32_t i386_addr;
1448
1449 a = *addrp;
1450
1451 if (only_care_about_chunk_p)
1452 goto try_chunk_p;
1453
1454 /* Skip all of this if there is no branch: */
1455 ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
1456
1457 /* 8b 86 38 30 00 00 mov 0x3038(%esi),%eax */
1458 *a++ = 0x8b; *a++ = 0x86;
1459 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1460
1461 /* 83 f8 00 cmp $0x0,%eax */
1462 /* 74 01 je 16b <skippa> */
1463 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1464 *a++ = 0x74; skip = a; *a++ = 0;
1465
1466 /*
1467 * Perform the jump by setting cpu->delay_slot = 0
1468 * and pc = cpu->delay_jmpaddr.
1469 */
1470
1471 /* c7 86 38 30 00 00 00 00 00 00 movl $0x0,0x3038(%esi) */
1472 ofs = ((size_t)&dummy_cpu.cd.mips.delay_slot) - (size_t)&dummy_cpu;
1473 *a++ = 0xc7; *a++ = 0x86;
1474 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1475 *a++ = 0; *a++ = 0; *a++ = 0; *a++ = 0;
1476
1477 /* REMEMBER old pc: */
1478 load_pc_into_eax_edx(&a);
1479 /* 89 c3 mov %eax,%ebx */
1480 /* 89 d1 mov %edx,%ecx */
1481 *a++ = 0x89; *a++ = 0xc3;
1482 *a++ = 0x89; *a++ = 0xd1;
1483 load_into_eax_edx(&a, &dummy_cpu.cd.mips.delay_jmpaddr);
1484 store_eax_edx_into_pc(&a);
1485
1486 try_chunk_p:
1487
1488 if (potential_chunk_p == NULL) {
1489 if (mem->bintrans_32bit_only) {
1490 #if 1
1491 /* 8b 86 78 56 34 12 mov 0x12345678(%esi),%eax */
1492 /* ff e0 jmp *%eax */
1493 ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_jump_to_32bit_pc) - (size_t)&dummy_cpu;
1494 *a++ = 0x8b; *a++ = 0x86;
1495 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1496 *a++ = 0xff; *a++ = 0xe0;
1497
1498 #else
1499 /* Don't execute too many instructions. */
1500 /* 81 fd f0 1f 00 00 cmpl $0x1ff0,%ebp */
1501 /* 7c 01 jl <okk> */
1502 /* c3 ret */
1503 *a++ = 0x81; *a++ = 0xfd;
1504 *a++ = (N_SAFE_BINTRANS_LIMIT-1) & 255;
1505 *a++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8) & 255; *a++ = 0; *a++ = 0;
1506 *a++ = 0x7c; failskip = a; *a++ = 0x01;
1507 bintrans_write_chunkreturn_fail(&a);
1508 *failskip = (size_t)a - (size_t)failskip - 1;
1509
1510 /*
1511 * ebx = ((vaddr >> 22) & 1023) * sizeof(void *)
1512 *
1513 * 89 c3 mov %eax,%ebx
1514 * c1 eb 14 shr $20,%ebx
1515 * 81 e3 fc 0f 00 00 and $0xffc,%ebx
1516 */
1517 *a++ = 0x89; *a++ = 0xc3;
1518 *a++ = 0xc1; *a++ = 0xeb; *a++ = 0x14;
1519 *a++ = 0x81; *a++ = 0xe3; *a++ = 0xfc; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1520
1521 /*
1522 * ecx = vaddr_to_hostaddr_table0
1523 *
1524 * 8b 8e 34 12 00 00 mov 0x1234(%esi),%ecx
1525 */
1526 ofs = ((size_t)&dummy_cpu.cd.mips.vaddr_to_hostaddr_table0) - (size_t)&dummy_cpu;
1527 *a++ = 0x8b; *a++ = 0x8e;
1528 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1529
1530 /*
1531 * ecx = vaddr_to_hostaddr_table0[a]
1532 *
1533 * 8b 0c 19 mov (%ecx,%ebx),%ecx
1534 */
1535 *a++ = 0x8b; *a++ = 0x0c; *a++ = 0x19;
1536
1537 /*
1538 * ebx = ((vaddr >> 12) & 1023) * sizeof(void *)
1539 *
1540 * 89 c3 mov %eax,%ebx
1541 * c1 eb 0a shr $10,%ebx
1542 * 81 e3 fc 0f 00 00 and $0xffc,%ebx
1543 */
1544 *a++ = 0x89; *a++ = 0xc3;
1545 *a++ = 0xc1; *a++ = 0xeb; *a++ = 0x0a;
1546 *a++ = 0x81; *a++ = 0xe3; *a++ = 0xfc; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1547
1548 /*
1549 * ecx = vaddr_to_hostaddr_table0[a][b].cd.mips.chunks
1550 *
1551 * 8b 8c 19 56 34 12 00 mov 0x123456(%ecx,%ebx,1),%ecx
1552 */
1553 ofs = (size_t)&dummy_vth32_table.cd.mips.bintrans_chunks[0]
1554 - (size_t)&dummy_vth32_table;
1555
1556 *a++ = 0x8b; *a++ = 0x8c; *a++ = 0x19;
1557 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1558
1559 /*
1560 * ecx = NULL? Then return with failure.
1561 *
1562 * 83 f9 00 cmp $0x0,%ecx
1563 * 75 01 jne <okzzz>
1564 */
1565 *a++ = 0x83; *a++ = 0xf9; *a++ = 0x00;
		*a++ = 0x75; failskip = a; *a++ = 0x00;
		bintrans_write_chunkreturn(&a);
		*failskip = (size_t)a - (size_t)failskip - 1;
1569
1570 /*
1571 * 25 fc 0f 00 00 and $0xffc,%eax
1572 * 01 c1 add %eax,%ecx
1573 *
1574 * 8b 01 mov (%ecx),%eax
1575 *
1576 * 83 f8 00 cmp $0x0,%eax
1577 * 75 01 jne <ok>
1578 * c3 ret
1579 */
1580 *a++ = 0x25; *a++ = 0xfc; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1581 *a++ = 0x01; *a++ = 0xc1;
1582
1583 *a++ = 0x8b; *a++ = 0x01;
1584
1585 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
		*a++ = 0x75; failskip = a; *a++ = 0x01;
		bintrans_write_chunkreturn(&a);
		*failskip = (size_t)a - (size_t)failskip - 1;
1589
1590 /* 03 86 78 56 34 12 add 0x12345678(%esi),%eax */
1591 /* ff e0 jmp *%eax */
1592 ofs = ((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu;
1593 *a++ = 0x03; *a++ = 0x86;
1594 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1595 *a++ = 0xff; *a++ = 0xe0;
1596 #endif
1597 } else {
1598 /* Not much we can do here if this wasn't to the same physical page... */
1599
1600 /* Don't execute too many instructions. */
1601 /* 81 fd f0 1f 00 00 cmpl $0x1ff0,%ebp */
1602 /* 7c 01 jl <okk> */
1603 /* c3 ret */
1604 *a++ = 0x81; *a++ = 0xfd;
1605 *a++ = (N_SAFE_BINTRANS_LIMIT-1) & 255;
1606 *a++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8) & 255; *a++ = 0; *a++ = 0;
1607 *a++ = 0x7c; failskip = a; *a++ = 0x01;
1608 bintrans_write_chunkreturn_fail(&a);
1609 *failskip = (size_t)a - (size_t)failskip - 1;
1610
1611 /*
1612 * Compare the old pc (ecx:ebx) and the new pc (edx:eax). If they are on the
1613 * same virtual page (which means that they are on the same physical
1614 * page), then we can check the right chunk pointer, and if it
1615 * is non-NULL, then we can jump there. Otherwise just return.
1616 */
1617
1618 /* Subtract 4 from the old pc first. (This is where the jump originated from.) */
1619 /* 83 eb 04 sub $0x4,%ebx */
1620 /* 83 d9 00 sbb $0x0,%ecx */
1621 *a++ = 0x83; *a++ = 0xeb; *a++ = 0x04;
1622 *a++ = 0x83; *a++ = 0xd9; *a++ = 0x00;
1623
1624 /* 39 d1 cmp %edx,%ecx */
1625 /* 74 01 je 1b9 <ok2> */
1626 /* c3 ret */
1627 *a++ = 0x39; *a++ = 0xd1;
1628 *a++ = 0x74; *a++ = 0x01;
1629 *a++ = 0xc3;
1630
1631 /* Remember new pc: */
1632 /* 89 c1 mov %eax,%ecx */
1633 *a++ = 0x89; *a++ = 0xc1;
1634
1635 /* 81 e3 00 f0 ff ff and $0xfffff000,%ebx */
1636 /* 25 00 f0 ff ff and $0xfffff000,%eax */
1637 *a++ = 0x81; *a++ = 0xe3; *a++ = 0x00; *a++ = 0xf0; *a++ = 0xff; *a++ = 0xff;
1638 *a++ = 0x25; *a++ = 0x00; *a++ = 0xf0; *a++ = 0xff; *a++ = 0xff;
1639
1640 /* 39 c3 cmp %eax,%ebx */
1641 /* 74 01 je <ok1> */
1642 /* c3 ret */
1643 *a++ = 0x39; *a++ = 0xc3;
1644 *a++ = 0x74; *a++ = 0x01;
1645 *a++ = 0xc3;
1646
1647 /* 81 e1 ff 0f 00 00 and $0xfff,%ecx */
1648 *a++ = 0x81; *a++ = 0xe1; *a++ = 0xff; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1649
1650 /* 8b 81 78 56 34 12 mov 0x12345678(%ecx),%eax */
1651 ofs = (size_t)chunks;
1652 *a++ = 0x8b; *a++ = 0x81; *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1653
1654 /* 83 f8 00 cmp $0x0,%eax */
1655 /* 75 01 jne 1cd <okjump> */
1656 /* c3 ret */
1657 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1658 *a++ = 0x75; *a++ = 0x01;
1659 *a++ = 0xc3;
1660
1661 /* 03 86 78 56 34 12 add 0x12345678(%esi),%eax */
1662 /* ff e0 jmp *%eax */
1663 ofs = ((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu;
1664 *a++ = 0x03; *a++ = 0x86;
1665 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1666 *a++ = 0xff; *a++ = 0xe0;
1667 }
1668 } else {
		/*
		 *  To make sure that the main program loop still gets to
		 *  run regularly (interrupts etc.), we need to return to
		 *  it every once in a while.
		 *
		 *  Load the "nr of instructions executed" (which is an int)
		 *  and see if it is below a certain threshold. If so, then
		 *  we go on with the fast path (bintrans), otherwise we
		 *  abort by returning.
		 */
1679 /* 81 fd f0 1f 00 00 cmpl $0x1ff0,%ebp */
1680 /* 7c 01 jl <okk> */
1681 /* c3 ret */
1682 if (!only_care_about_chunk_p && !forward) {
1683 *a++ = 0x81; *a++ = 0xfd;
1684 *a++ = (N_SAFE_BINTRANS_LIMIT-1) & 255;
1685 *a++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8) & 255; *a++ = 0; *a++ = 0;
1686 *a++ = 0x7c; failskip = a; *a++ = 0x01;
1687 bintrans_write_chunkreturn_fail(&a);
1688 *failskip = (size_t)a - (size_t)failskip - 1;
1689 }
1690
		/*
		 *  potential_chunk_p points to a uint32_t. If this value
		 *  is non-zero, it is the offset of a piece of i386
		 *  machine code corresponding to the address we're
		 *  jumping to (cpu->chunk_base_address has to be added
		 *  to get the actual host address). Otherwise, those
		 *  instructions haven't been translated yet, so we have
		 *  to return to the main loop.
		 *
		 *  Case 1:  The value is already non-zero at translation
		 *           time. Then we can make a direct (fast) native
		 *           i386 jump to the code chunk.
		 *
		 *  Case 2:  The value was zero at translation time, so we
		 *           have to check at runtime.
		 */
1706
1707 /* Case 1: */
1708 /* printf("%08x ", *potential_chunk_p); */
1709 i386_addr = *potential_chunk_p +
1710 (size_t)mem->translation_code_chunk_space;
1711 i386_addr = i386_addr - ((size_t)a + 5);
1712 if ((*potential_chunk_p) != 0) {
1713 *a++ = 0xe9;
1714 *a++ = i386_addr;
1715 *a++ = i386_addr >> 8;
1716 *a++ = i386_addr >> 16;
1717 *a++ = i386_addr >> 24;
1718 } else {
1719 /* Case 2: */
1720
1721 bintrans_register_potential_quick_jump(mem, a, p);
1722
1723 i386_addr = (size_t)potential_chunk_p;
1724
			/*
			 *  Load the chunk pointer into eax.
			 *  If it is zero, then return to the main loop.
			 *  Otherwise add chunk_base_address to eax, and
			 *  jump to eax.
			 */
1730
1731 /* a1 78 56 34 12 mov 0x12345678,%eax */
1732 /* 83 f8 00 cmp $0x0,%eax */
1733 /* 75 01 jne <okaa> */
1734 /* c3 ret */
1735 *a++ = 0xa1;
1736 *a++ = i386_addr; *a++ = i386_addr >> 8;
1737 *a++ = i386_addr >> 16; *a++ = i386_addr >> 24;
1738 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
1739 *a++ = 0x75; *a++ = 0x01;
1740 *a++ = 0xc3;
1741
1742 /* 03 86 78 56 34 12 add 0x12345678(%esi),%eax */
1743 /* ff e0 jmp *%eax */
1744 ofs = ((size_t)&dummy_cpu.cd.mips.chunk_base_address) - (size_t)&dummy_cpu;
1745 *a++ = 0x03; *a++ = 0x86;
1746 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
1747 *a++ = 0xff; *a++ = 0xe0;
1748 }
1749 }
1750
1751 if (skip != NULL)
1752 *skip = (size_t)a - (size_t)skip - 1;
1753
1754 *addrp = a;
1755 return 1;
1756 }
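
/*
 *  Two encodings recur in the function above: 32-bit values are written
 *  byte by byte in little-endian order, and the direct jump in "Case 1"
 *  uses a displacement relative to the end of the 5-byte jmp instruction
 *  (hence the "+ 5"). The following standalone sketch illustrates both.
 *  The helper names are hypothetical, and addresses are passed in as
 *  uint32_t values to mimic a 32-bit host; this is not code from this
 *  file.
 */

```c
#include <stdint.h>

/*  Store a 32-bit value in little-endian byte order.  */
static unsigned char *emit_le32(unsigned char *a, uint32_t v)
{
	*a++ = v & 255;
	*a++ = (v >> 8) & 255;
	*a++ = (v >> 16) & 255;
	*a++ = (v >> 24) & 255;
	return a;
}

/*
 *  Emit "jmp rel32" (opcode 0xe9). 'a_addr' is the address at which
 *  the instruction will execute, 'target_addr' is where it should
 *  land. The displacement counts from the end of the instruction.
 */
static unsigned char *emit_jmp_rel32(unsigned char *a, uint32_t a_addr,
	uint32_t target_addr)
{
	uint32_t rel = target_addr - (a_addr + 5);

	*a++ = 0xe9;
	return emit_le32(a, rel);
}
```

/*
 *  Unsigned wrap-around makes backward jumps work without any special
 *  case: a target below the instruction yields a large unsigned value,
 *  which the CPU interprets as a negative displacement.
 */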
1757
1758
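/*
 *  bintrans_write_instruction__loadstore():
 *
 *  Emits i386 code for a MIPS load or store instruction. Returns 0 if
 *  the instruction is not translated (LQ/SQ, big-endian emulation, or a
 *  load into register zero).
 */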
1762 static int bintrans_write_instruction__loadstore(struct memory *mem,
1763 unsigned char **addrp, int rt, int imm, int rs,
1764 int instruction_type, int bigendian)
1765 {
1766 unsigned char *a, *retfail, *generic64bit, *doloadstore,
1767 *okret0, *okret1, *okret2, *skip;
1768 int ofs, alignment, load=0, unaligned=0;
1769
1770 /* TODO: Not yet: */
1771 if (instruction_type == HI6_LQ_MDMX || instruction_type == HI6_SQ)
1772 return 0;
1773
1774 /* TODO: Not yet: */
1775 if (bigendian)
1776 return 0;
1777
1778 switch (instruction_type) {
1779 case HI6_LQ_MDMX:
1780 case HI6_LDL:
1781 case HI6_LDR:
1782 case HI6_LD:
1783 case HI6_LWU:
1784 case HI6_LWL:
1785 case HI6_LWR:
1786 case HI6_LW:
1787 case HI6_LHU:
1788 case HI6_LH:
1789 case HI6_LBU:
1790 case HI6_LB:
1791 load = 1;
1792 if (rt == 0)
1793 return 0;
1794 }
1795
1796 switch (instruction_type) {
1797 case HI6_LWL:
1798 case HI6_LWR:
1799 case HI6_LDL:
1800 case HI6_LDR:
1801 case HI6_SWL:
1802 case HI6_SWR:
1803 case HI6_SDL:
1804 case HI6_SDR:
1805 unaligned = 1;
1806 }
1807
1808 a = *addrp;
1809
1810 if (mem->bintrans_32bit_only)
1811 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
1812 else
1813 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
1814
1815 if (imm & 0x8000) {
1816 /* 05 34 f2 ff ff add $0xfffff234,%eax */
1817 /* 83 d2 ff adc $0xffffffff,%edx */
1818 *a++ = 5;
1819 *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
1820 if (!mem->bintrans_32bit_only) {
1821 *a++ = 0x83; *a++ = 0xd2; *a++ = 0xff;
1822 }
1823 } else {
1824 /* 05 34 12 00 00 add $0x1234,%eax */
1825 /* 83 d2 00 adc $0x0,%edx */
1826 *a++ = 5;
1827 *a++ = imm; *a++ = imm >> 8; *a++ = 0; *a++ = 0;
1828 if (!mem->bintrans_32bit_only) {
1829 *a++ = 0x83; *a++ = 0xd2; *a++ = 0;
1830 }
1831 }
1832
1833 alignment = 0;
1834 switch (instruction_type) {
1835 case HI6_LQ_MDMX:
1836 case HI6_SQ:
1837 alignment = 15;
1838 break;
1839 case HI6_LD:
1840 case HI6_LDL:
1841 case HI6_LDR:
1842 case HI6_SD:
1843 case HI6_SDL:
1844 case HI6_SDR:
1845 alignment = 7;
1846 break;
1847 case HI6_LW:
1848 case HI6_LWL:
1849 case HI6_LWR:
1850 case HI6_LWU:
1851 case HI6_SW:
1852 case HI6_SWL:
1853 case HI6_SWR:
1854 alignment = 3;
1855 break;
1856 case HI6_LH:
1857 case HI6_LHU:
1858 case HI6_SH:
1859 alignment = 1;
1860 break;
1861 }
1862
1863 if (unaligned) {
1864 /*
1865 * Perform the actual load/store from an
1866 * aligned address.
1867 *
1868 * 83 e0 fc and $0xfffffffc,%eax
1869 */
1870 *a++ = 0x83; *a++ = 0xe0; *a++ = 0xff - alignment;
1871 } else if (alignment > 0) {
1872 unsigned char *alignskip;
1873 /*
1874 * Check alignment:
1875 *
1876 * 89 c3 mov %eax,%ebx
1877 * 83 e3 01 and $0x1,%ebx
1878 * 74 01 jz <ok>
1879 * c3 ret
1880 */
1881 *a++ = 0x89; *a++ = 0xc3;
1882 *a++ = 0x83; *a++ = 0xe3; *a++ = alignment;
1883 *a++ = 0x74; alignskip = a; *a++ = 0x00;
1884 bintrans_write_chunkreturn_fail(&a);
1885 *alignskip = (size_t)a - (size_t)alignskip - 1;
1886 }
1887
1888
1889 /* Here, edx:eax = vaddr */
1890
1891 if (mem->bintrans_32bit_only) {
1892 /* Call the quick lookup routine: */
1893 ofs = (size_t)bintrans_loadstore_32bit;
1894 ofs = ofs - ((size_t)a + 5);
1895 *a++ = 0xe8; *a++ = ofs; *a++ = ofs >> 8;
1896 *a++ = ofs >> 16; *a++ = ofs >> 24;
1897
1898 /*
1899 * ecx = NULL? Then return with failure.
1900 *
1901 * 83 f9 00 cmp $0x0,%ecx
1902 * 75 01 jne <okzzz>
1903 */
1904 *a++ = 0x83; *a++ = 0xf9; *a++ = 0x00;
1905 *a++ = 0x75; retfail = a; *a++ = 0x00;
1906 bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
1907 *retfail = (size_t)a - (size_t)retfail - 1;
1908
1909 /*
1910 * If the lowest bit is zero, and we're storing, then fail.
1911 */
1912 if (!load) {
1913 /*
1914 * f7 c1 01 00 00 00 test $0x1,%ecx
1915 * 75 01 jne <ok>
1916 */
1917 *a++ = 0xf7; *a++ = 0xc1; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
1918 *a++ = 0x75; retfail = a; *a++ = 0x00;
1919 bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
1920 *retfail = (size_t)a - (size_t)retfail - 1;
1921 }
1922
1923 /*
1924 * eax = offset within page = vaddr & 0xfff
1925 *
1926 * 25 ff 0f 00 00 and $0xfff,%eax
1927 */
1928 *a++ = 0x25; *a++ = 0xff; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1929
1930 /*
1931 * ecx = host address ( = host page + offset)
1932 *
1933 * 83 e1 fe and $0xfffffffe,%ecx clear the lowest bit
1934 * 01 c1 add %eax,%ecx
1935 */
1936 *a++ = 0x83; *a++ = 0xe1; *a++ = 0xfe;
1937 *a++ = 0x01; *a++ = 0xc1;
1938 } else {
		/*
		 *  If the load/store address has the top 32 bits set to
		 *  0x00000000 or 0xffffffff, then we can use the 32-bit
		 *  lookup tables:
		 *
		 *  TODO: top 33 bits!
		 *
		 *  83 fa 00                cmp    $0x0,%edx
		 *  74 05                   je     <ok32>
		 *  83 fa ff                cmp    $0xffffffff,%edx
		 *  75 01                   jne    <not32>
		 */
1952 *a++ = 0x83; *a++ = 0xfa; *a++ = 0x00;
1953 *a++ = 0x74; *a++ = 0x05;
1954 *a++ = 0x83; *a++ = 0xfa; *a++ = 0xff;
1955 *a++ = 0x75; generic64bit = a; *a++ = 0x01;
1956
1957 /* Call the quick lookup routine: */
1958 ofs = (size_t)bintrans_loadstore_32bit;
1959 ofs = ofs - ((size_t)a + 5);
1960 *a++ = 0xe8; *a++ = ofs; *a++ = ofs >> 8;
1961 *a++ = ofs >> 16; *a++ = ofs >> 24;
1962
1963 /*
1964 * ecx = NULL? Then return with failure.
1965 *
1966 * 83 f9 00 cmp $0x0,%ecx
1967 * 75 01 jne <okzzz>
1968 */
1969 *a++ = 0x83; *a++ = 0xf9; *a++ = 0x00;
1970 *a++ = 0x75; retfail = a; *a++ = 0x00;
1971 bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
1972 *retfail = (size_t)a - (size_t)retfail - 1;
1973
1974 /*
1975 * If the lowest bit is zero, and we're storing, then fail.
1976 */
1977 if (!load) {
1978 /*
1979 * f7 c1 01 00 00 00 test $0x1,%ecx
1980 * 75 01 jne <ok>
1981 */
1982 *a++ = 0xf7; *a++ = 0xc1; *a++ = 1; *a++ = 0; *a++ = 0; *a++ = 0;
1983 *a++ = 0x75; retfail = a; *a++ = 0x00;
1984 bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
1985 *retfail = (size_t)a - (size_t)retfail - 1;
1986 }
1987
1988 /*
1989 * eax = offset within page = vaddr & 0xfff
1990 *
1991 * 25 ff 0f 00 00 and $0xfff,%eax
1992 */
1993 *a++ = 0x25; *a++ = 0xff; *a++ = 0x0f; *a++ = 0; *a++ = 0;
1994
1995 /*
1996 * ecx = host address ( = host page + offset)
1997 *
1998 * 83 e1 fe and $0xfffffffe,%ecx clear the lowest bit
1999 * 01 c1 add %eax,%ecx
2000 */
2001 *a++ = 0x83; *a++ = 0xe1; *a++ = 0xfe;
2002 *a++ = 0x01; *a++ = 0xc1;
2003
2004 *a++ = 0xeb; doloadstore = a; *a++ = 0x01;
2005
2006
2007 /* TODO: The stuff above is so similar to the pure 32-bit
2008 case that it should be factored out. */
2009
2010
2011 *generic64bit = (size_t)a - (size_t)generic64bit - 1;
2012
2013 /*
2014 * 64-bit generic case:
2015 */
2016
2017 /* push writeflag */
2018 *a++ = 0x6a; *a++ = load? 0 : 1;
2019
2020 /* push vaddr (edx:eax) */
2021 *a++ = 0x52; *a++ = 0x50;
2022
2023 /* push cpu (esi) */
2024 *a++ = 0x56;
2025
2026 /* eax = points to the right function */
2027 ofs = ((size_t)&dummy_cpu.cd.mips.fast_vaddr_to_hostaddr) - (size_t)&dummy_cpu;
2028 *a++ = 0x8b; *a++ = 0x86;
2029 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
2030
2031 /* ff d0 call *%eax */
2032 *a++ = 0xff; *a++ = 0xd0;
2033
		/*  83 c4 10                add    $0x10,%esp  */
2035 *a++ = 0x83; *a++ = 0xc4; *a++ = 0x10;
2036
2037 /* If eax is NULL, then return. */
2038 /* 83 f8 00 cmp $0x0,%eax */
2039 /* 75 01 jne 1cd <okjump> */
2040 /* c3 ret */
2041 *a++ = 0x83; *a++ = 0xf8; *a++ = 0x00;
2042 *a++ = 0x75; retfail = a; *a++ = 0x00;
2043 bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
2044 *retfail = (size_t)a - (size_t)retfail - 1;
2045
2046 /* 89 c1 mov %eax,%ecx */
2047 *a++ = 0x89; *a++ = 0xc1;
2048
2049 *doloadstore = (size_t)a - (size_t)doloadstore - 1;
2050 }
2051
2052
2053 if (!load) {
2054 if (alignment >= 7)
2055 load_into_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2056 else
2057 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2058 }
2059
2060 switch (instruction_type) {
2061 case HI6_LD:
2062 /* 8b 01 mov (%ecx),%eax */
2063 /* 8b 51 04 mov 0x4(%ecx),%edx */
2064 *a++ = 0x8b; *a++ = 0x01;
2065 *a++ = 0x8b; *a++ = 0x51; *a++ = 0x04;
2066 break;
2067 case HI6_LWU:
2068 /* 8b 01 mov (%ecx),%eax */
2069 /* 31 d2 xor %edx,%edx */
2070 *a++ = 0x8b; *a++ = 0x01;
2071 *a++ = 0x31; *a++ = 0xd2;
2072 break;
2073 case HI6_LW:
2074 /* 8b 01 mov (%ecx),%eax */
2075 /* 99 cltd */
2076 *a++ = 0x8b; *a++ = 0x01;
2077 *a++ = 0x99;
2078 break;
2079 case HI6_LHU:
2080 /* 31 c0 xor %eax,%eax */
2081 /* 66 8b 01 mov (%ecx),%ax */
2082 /* 99 cltd */
2083 *a++ = 0x31; *a++ = 0xc0;
2084 *a++ = 0x66; *a++ = 0x8b; *a++ = 0x01;
2085 *a++ = 0x99;
2086 break;
2087 case HI6_LH:
2088 /* 66 8b 01 mov (%ecx),%ax */
2089 /* 98 cwtl */
2090 /* 99 cltd */
2091 *a++ = 0x66; *a++ = 0x8b; *a++ = 0x01;
2092 *a++ = 0x98;
2093 *a++ = 0x99;
2094 break;
2095 case HI6_LBU:
2096 /* 31 c0 xor %eax,%eax */
2097 /* 8a 01 mov (%ecx),%al */
2098 /* 99 cltd */
2099 *a++ = 0x31; *a++ = 0xc0;
2100 *a++ = 0x8a; *a++ = 0x01;
2101 *a++ = 0x99;
2102 break;
2103 case HI6_LB:
2104 /* 8a 01 mov (%ecx),%al */
2105 /* 66 98 cbtw */
2106 /* 98 cwtl */
2107 /* 99 cltd */
2108 *a++ = 0x8a; *a++ = 0x01;
2109 *a++ = 0x66; *a++ = 0x98;
2110 *a++ = 0x98;
2111 *a++ = 0x99;
2112 break;
2113
2114 case HI6_LWL:
2115 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
		/*  05 34 f2 ff ff          add    $0xfffff234,%eax  */
		/*  (Only the two lowest bits of the sum are used below, so the upper immediate bytes don't matter.)  */
2117 *a++ = 5;
2118 *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
2119 /* 83 e0 03 and $0x03,%eax */
2120 *a++ = 0x83; *a++ = 0xe0; *a++ = alignment;
2121 /* 89 c3 mov %eax,%ebx */
2122 *a++ = 0x89; *a++ = 0xc3;
2123
2124 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2125
2126 /* ALIGNED LOAD: */
2127 /* 8b 11 mov (%ecx),%edx */
2128 *a++ = 0x8b; *a++ = 0x11;
2129
2130 /*
2131 * CASE 0:
2132 * memory = 0x12 0x34 0x56 0x78
2133 * register after lwl: 0x12 0x.. 0x.. 0x..
2134 */
2135 /* 83 fb 00 cmp $0x0,%ebx */
2136 /* 75 01 jne <skip> */
2137 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
2138 *a++ = 0x75; skip = a; *a++ = 0x01;
2139
2140 /* c1 e2 18 shl $0x18,%edx */
2141 /* 25 ff ff ff 00 and $0xffffff,%eax */
2142 /* 09 d0 or %edx,%eax */
2143 *a++ = 0xc1; *a++ = 0xe2; *a++ = 0x18;
2144 *a++ = 0x25; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff; *a++ = 0x00;
2145 *a++ = 0x09; *a++ = 0xd0;
2146
2147 /* eb 00 jmp <okret> */
2148 *a++ = 0xeb; okret0 = a; *a++ = 0;
2149
2150 *skip = (size_t)a - (size_t)skip - 1;
2151
2152 /*
2153 * CASE 1:
2154 * memory = 0x12 0x34 0x56 0x78
2155 * register after lwl: 0x34 0x12 0x.. 0x..
2156 */
2157 /* 83 fb 01 cmp $0x1,%ebx */
2158 /* 75 01 jne <skip> */
2159 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x01;
2160 *a++ = 0x75; skip = a; *a++ = 0x01;
2161
2162 /* c1 e2 10 shl $0x10,%edx */
2163 /* 25 ff ff 00 00 and $0xffff,%eax */
2164 /* 09 d0 or %edx,%eax */
2165 *a++ = 0xc1; *a++ = 0xe2; *a++ = 0x10;
2166 *a++ = 0x25; *a++ = 0xff; *a++ = 0xff; *a++ = 0x00; *a++ = 0x00;
2167 *a++ = 0x09; *a++ = 0xd0;
2168
2169 /* eb 00 jmp <okret> */
2170 *a++ = 0xeb; okret1 = a; *a++ = 0;
2171
2172 *skip = (size_t)a - (size_t)skip - 1;
2173
2174 /*
2175 * CASE 2:
2176 * memory = 0x12 0x34 0x56 0x78
2177 * register after lwl: 0x56 0x34 0x12 0x..
2178 */
2179 /* 83 fb 02 cmp $0x2,%ebx */
2180 /* 75 01 jne <skip> */
2181 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x02;
2182 *a++ = 0x75; skip = a; *a++ = 0x01;
2183
2184 /* c1 e2 08 shl $0x08,%edx */
2185 /* 25 ff 00 00 00 and $0xff,%eax */
2186 /* 09 d0 or %edx,%eax */
2187 *a++ = 0xc1; *a++ = 0xe2; *a++ = 0x08;
2188 *a++ = 0x25; *a++ = 0xff; *a++ = 0x00; *a++ = 0x00; *a++ = 0x00;
2189 *a++ = 0x09; *a++ = 0xd0;
2190
2191 /* eb 00 jmp <okret> */
2192 *a++ = 0xeb; okret2 = a; *a++ = 0;
2193
2194 *skip = (size_t)a - (size_t)skip - 1;
2195
2196 /*
2197 * CASE 3:
2198 * memory = 0x12 0x34 0x56 0x78
2199 * register after lwl: 0x78 0x56 0x34 0x12
2200 */
2201 /* 89 d0 mov %edx,%eax */
2202 *a++ = 0x89; *a++ = 0xd0;
2203
2204 /* okret: */
2205 *okret0 = (size_t)a - (size_t)okret0 - 1;
2206 *okret1 = (size_t)a - (size_t)okret1 - 1;
2207 *okret2 = (size_t)a - (size_t)okret2 - 1;
2208
2209 /* 99 cltd */
2210 *a++ = 0x99;
2211 break;
2212
2213 case HI6_LWR:
2214 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
2215 /* 05 34 f2 ff ff add $0xfffff234,%eax */
2216 *a++ = 5;
2217 *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
2218 /* 83 e0 03 and $0x03,%eax */
2219 *a++ = 0x83; *a++ = 0xe0; *a++ = alignment;
2220 /* 89 c3 mov %eax,%ebx */
2221 *a++ = 0x89; *a++ = 0xc3;
2222
2223 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2224
2225 /* ALIGNED LOAD: */
2226 /* 8b 11 mov (%ecx),%edx */
2227 *a++ = 0x8b; *a++ = 0x11;
2228
2229 /*
2230 * CASE 0:
2231 * memory = 0x12 0x34 0x56 0x78
2232 * register after lwr: 0x78 0x56 0x34 0x12
2233 */
2234 /* 83 fb 00 cmp $0x0,%ebx */
2235 /* 75 01 jne <skip> */
2236 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
2237 *a++ = 0x75; skip = a; *a++ = 0x01;
2238
2239 /* 89 d0 mov %edx,%eax */
2240 *a++ = 0x89; *a++ = 0xd0;
2241
2242 /* eb 00 jmp <okret> */
2243 *a++ = 0xeb; okret0 = a; *a++ = 0;
2244
2245 *skip = (size_t)a - (size_t)skip - 1;
2246
2247 /*
2248 * CASE 1:
2249 * memory = 0x12 0x34 0x56 0x78
2250 * register after lwr: 0x.. 0x78 0x56 0x34
2251 */
2252 /* 83 fb 01 cmp $0x1,%ebx */
2253 /* 75 01 jne <skip> */
2254 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x01;
2255 *a++ = 0x75; skip = a; *a++ = 0x01;
2256
2257 /* c1 ea 08 shr $0x8,%edx */
2258 /* 25 00 00 00 ff and $0xff000000,%eax */
2259 /* 09 d0 or %edx,%eax */
2260 *a++ = 0xc1; *a++ = 0xea; *a++ = 0x08;
2261 *a++ = 0x25; *a++ = 0x00; *a++ = 0x00; *a++ = 0x00; *a++ = 0xff;
2262 *a++ = 0x09; *a++ = 0xd0;
2263
2264 /* eb 00 jmp <okret> */
2265 *a++ = 0xeb; okret1 = a; *a++ = 0;
2266
2267 *skip = (size_t)a - (size_t)skip - 1;
2268
2269 /*
2270 * CASE 2:
2271 * memory = 0x12 0x34 0x56 0x78
2272 * register after lwr: 0x.. 0x.. 0x78 0x56
2273 */
2274 /* 83 fb 02 cmp $0x2,%ebx */
2275 /* 75 01 jne <skip> */
2276 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x02;
2277 *a++ = 0x75; skip = a; *a++ = 0x01;
2278
2279 /* c1 ea 10 shr $0x10,%edx */
2280 /* 25 00 00 ff ff and $0xffff0000,%eax */
2281 /* 09 d0 or %edx,%eax */
2282 *a++ = 0xc1; *a++ = 0xea; *a++ = 0x10;
2283 *a++ = 0x25; *a++ = 0x00; *a++ = 0x00; *a++ = 0xff; *a++ = 0xff;
2284 *a++ = 0x09; *a++ = 0xd0;
2285
2286 /* eb 00 jmp <okret> */
2287 *a++ = 0xeb; okret2 = a; *a++ = 0;
2288
2289 *skip = (size_t)a - (size_t)skip - 1;
2290
2291 /*
2292 * CASE 3:
2293 * memory = 0x12 0x34 0x56 0x78
2294 * register after lwr: 0x.. 0x.. 0x.. 0x78
2295 */
2296 /* c1 ea 18 shr $0x18,%edx */
2297 /* 25 00 ff ff ff and $0xffffff00,%eax */
2298 /* 09 d0 or %edx,%eax */
2299 *a++ = 0xc1; *a++ = 0xea; *a++ = 0x18;
2300 *a++ = 0x25; *a++ = 0x00; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff;
2301 *a++ = 0x09; *a++ = 0xd0;
2302
2303 /* okret: */
2304 *okret0 = (size_t)a - (size_t)okret0 - 1;
2305 *okret1 = (size_t)a - (size_t)okret1 - 1;
2306 *okret2 = (size_t)a - (size_t)okret2 - 1;
2307
2308 /* 99 cltd */
2309 *a++ = 0x99;
2310 break;
2311
2312 case HI6_SD:
2313 /* 89 01 mov %eax,(%ecx) */
2314 /* 89 51 04 mov %edx,0x4(%ecx) */
2315 *a++ = 0x89; *a++ = 0x01;
2316 *a++ = 0x89; *a++ = 0x51; *a++ = 0x04;
2317 break;
2318 case HI6_SW:
2319 /* 89 01 mov %eax,(%ecx) */
2320 *a++ = 0x89; *a++ = 0x01;
2321 break;
2322 case HI6_SH:
2323 /* 66 89 01 mov %ax,(%ecx) */
2324 *a++ = 0x66; *a++ = 0x89; *a++ = 0x01;
2325 break;
2326 case HI6_SB:
2327 /* 88 01 mov %al,(%ecx) */
2328 *a++ = 0x88; *a++ = 0x01;
2329 break;
2330
2331 case HI6_SWL:
2332 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
2333 /* 05 34 f2 ff ff add $0xfffff234,%eax */
2334 *a++ = 5;
2335 *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
2336 /* 83 e0 03 and $0x03,%eax */
2337 *a++ = 0x83; *a++ = 0xe0; *a++ = alignment;
2338 /* 89 c3 mov %eax,%ebx */
2339 *a++ = 0x89; *a++ = 0xc3;
2340
2341 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2342
2343 /* ALIGNED LOAD: */
2344 /* 8b 11 mov (%ecx),%edx */
2345 *a++ = 0x8b; *a++ = 0x11;
2346
2347 /*
2348 * CASE 0:
2349 * memory (edx): 0x12 0x34 0x56 0x78
2350 * register (eax): 0x89abcdef
2351 * mem after swl: 0x89 0x.. 0x.. 0x..
2352 */
2353 /* 83 fb 00 cmp $0x0,%ebx */
2354 /* 75 01 jne <skip> */
2355 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
2356 *a++ = 0x75; skip = a; *a++ = 0x01;
2357
2358 /* 81 e2 00 ff ff ff and $0xffffff00,%edx */
2359 /* c1 e8 18 shr $0x18,%eax */
2360 /* 09 d0 or %edx,%eax */
2361 *a++ = 0x81; *a++ = 0xe2; *a++ = 0x00; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff;
2362 *a++ = 0xc1; *a++ = 0xe8; *a++ = 0x18;
2363 *a++ = 0x09; *a++ = 0xd0;
2364
2365 /* eb 00 jmp <okret> */
2366 *a++ = 0xeb; okret0 = a; *a++ = 0;
2367
2368 *skip = (size_t)a - (size_t)skip - 1;
2369
2370 /*
2371 * CASE 1:
2372 * memory (edx): 0x12 0x34 0x56 0x78
2373 * register (eax): 0x89abcdef
2374 * mem after swl: 0xab 0x89 0x.. 0x..
2375 */
2376 /* 83 fb 01 cmp $0x1,%ebx */
2377 /* 75 01 jne <skip> */
2378 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x01;
2379 *a++ = 0x75; skip = a; *a++ = 0x01;
2380
2381 /* 81 e2 00 00 ff ff and $0xffff0000,%edx */
2382 /* c1 e8 10 shr $0x10,%eax */
2383 /* 09 d0 or %edx,%eax */
2384 *a++ = 0x81; *a++ = 0xe2; *a++ = 0x00; *a++ = 0x00; *a++ = 0xff; *a++ = 0xff;
2385 *a++ = 0xc1; *a++ = 0xe8; *a++ = 0x10;
2386 *a++ = 0x09; *a++ = 0xd0;
2387
2388 /* eb 00 jmp <okret> */
2389 *a++ = 0xeb; okret1 = a; *a++ = 0;
2390
2391 *skip = (size_t)a - (size_t)skip - 1;
2392
2393 /*
2394 * CASE 2:
2395 * memory (edx): 0x12 0x34 0x56 0x78
2396 * register (eax): 0x89abcdef
2397 * mem after swl: 0xcd 0xab 0x89 0x..
2398 */
2399 /* 83 fb 02 cmp $0x2,%ebx */
2400 /* 75 01 jne <skip> */
2401 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x02;
2402 *a++ = 0x75; skip = a; *a++ = 0x01;
2403
2404 /* 81 e2 00 00 00 ff and $0xff000000,%edx */
2405 /* c1 e8 08 shr $0x08,%eax */
2406 /* 09 d0 or %edx,%eax */
2407 *a++ = 0x81; *a++ = 0xe2; *a++ = 0x00; *a++ = 0x00; *a++ = 0x00; *a++ = 0xff;
2408 *a++ = 0xc1; *a++ = 0xe8; *a++ = 0x08;
2409 *a++ = 0x09; *a++ = 0xd0;
2410
2411 /* eb 00 jmp <okret> */
2412 *a++ = 0xeb; okret2 = a; *a++ = 0;
2413
2414 *skip = (size_t)a - (size_t)skip - 1;
2415
2416 /*
2417 * CASE 3:
2418 * memory (edx): 0x12 0x34 0x56 0x78
2419 * register (eax): 0x89abcdef
2420 * mem after swl: 0xef 0xcd 0xab 0x89
2421 */
2422 /* eax already holds the full word to store; nothing to emit */
2423
2424 /* okret: */
2425 *okret0 = (size_t)a - (size_t)okret0 - 1;
2426 *okret1 = (size_t)a - (size_t)okret1 - 1;
2427 *okret2 = (size_t)a - (size_t)okret2 - 1;
2428
2429 /* Store back to memory: */
2430 /* 89 01 mov %eax,(%ecx) */
2431 *a++ = 0x89; *a++ = 0x01;
2432 break;
2433
2434 case HI6_SWR:
2435 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rs]);
2436 /* 05 34 f2 ff ff add $0xfffff234,%eax */
2437 *a++ = 5;
2438 *a++ = imm; *a++ = imm >> 8; *a++ = 0xff; *a++ = 0xff;
2439 /* 83 e0 03 and $0x03,%eax */
2440 *a++ = 0x83; *a++ = 0xe0; *a++ = alignment;
2441 /* 89 c3 mov %eax,%ebx */
2442 *a++ = 0x89; *a++ = 0xc3;
2443
2444 load_into_eax_dont_care_about_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2445
2446 /* ALIGNED LOAD: */
2447 /* 8b 11 mov (%ecx),%edx */
2448 *a++ = 0x8b; *a++ = 0x11;
2449
2450 /*
2451 * CASE 0:
2452 * memory (edx): 0x12 0x34 0x56 0x78
2453 * register (eax): 0x89abcdef
2454 * mem after swr: 0xef 0xcd 0xab 0x89
2455 */
2456 /* 83 fb 00 cmp $0x0,%ebx */
2457 /* 75 01 jne <skip> */
2458 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x00;
2459 *a++ = 0x75; skip = a; *a++ = 0x01;
2460
2461 /* eax = eax, so do nothing */
2462
2463 /* eb 00 jmp <okret> */
2464 *a++ = 0xeb; okret0 = a; *a++ = 0;
2465
2466 *skip = (size_t)a - (size_t)skip - 1;
2467
2468 /*
2469 * CASE 1:
2470 * memory (edx): 0x12 0x34 0x56 0x78
2471 * register (eax): 0x89abcdef
2472 * mem after swr: 0x12 0xef 0xcd 0xab
2473 */
2474 /* 83 fb 01 cmp $0x1,%ebx */
2475 /* 75 01 jne <skip> */
2476 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x01;
2477 *a++ = 0x75; skip = a; *a++ = 0x01;
2478
2479 /* 81 e2 ff 00 00 00 and $0x000000ff,%edx */
2480 /* c1 e0 08 shl $0x08,%eax */
2481 /* 09 d0 or %edx,%eax */
2482 *a++ = 0x81; *a++ = 0xe2; *a++ = 0xff; *a++ = 0x00; *a++ = 0x00; *a++ = 0x00;
2483 *a++ = 0xc1; *a++ = 0xe0; *a++ = 0x08;
2484 *a++ = 0x09; *a++ = 0xd0;
2485
2486 /* eb 00 jmp <okret> */
2487 *a++ = 0xeb; okret1 = a; *a++ = 0;
2488
2489 *skip = (size_t)a - (size_t)skip - 1;
2490
2491 /*
2492 * CASE 2:
2493 * memory (edx): 0x12 0x34 0x56 0x78
2494 * register (eax): 0x89abcdef
2495 * mem after swr: 0x12 0x34 0xef 0xcd
2496 */
2497 /* 83 fb 02 cmp $0x2,%ebx */
2498 /* 75 01 jne <skip> */
2499 *a++ = 0x83; *a++ = 0xfb; *a++ = 0x02;
2500 *a++ = 0x75; skip = a; *a++ = 0x01;
2501
2502 /* 81 e2 ff ff 00 00 and $0x0000ffff,%edx */
2503 /* c1 e0 10 shl $0x10,%eax */
2504 /* 09 d0 or %edx,%eax */
2505 *a++ = 0x81; *a++ = 0xe2; *a++ = 0xff; *a++ = 0xff; *a++ = 0x00; *a++ = 0x00;
2506 *a++ = 0xc1; *a++ = 0xe0; *a++ = 0x10;
2507 *a++ = 0x09; *a++ = 0xd0;
2508
2509 /* eb 00 jmp <okret> */
2510 *a++ = 0xeb; okret2 = a; *a++ = 0;
2511
2512 *skip = (size_t)a - (size_t)skip - 1;
2513
2514 /*
2515 * CASE 3:
2516 * memory (edx): 0x12 0x34 0x56 0x78
2517 * register (eax): 0x89abcdef
2518 * mem after swr: 0x12 0x34 0x56 0xef
2519 */
2520 /* 81 e2 ff ff ff 00 and $0x00ffffff,%edx */
2521 /* c1 e0 18 shl $0x18,%eax */
2522 /* 09 d0 or %edx,%eax */
2523 *a++ = 0x81; *a++ = 0xe2; *a++ = 0xff; *a++ = 0xff; *a++ = 0xff; *a++ = 0x00;
2524 *a++ = 0xc1; *a++ = 0xe0; *a++ = 0x18;
2525 *a++ = 0x09; *a++ = 0xd0;
2526
2527
2528 /* okret: */
2529 *okret0 = (size_t)a - (size_t)okret0 - 1;
2530 *okret1 = (size_t)a - (size_t)okret1 - 1;
2531 *okret2 = (size_t)a - (size_t)okret2 - 1;
2532
2533 /* Store back to memory: */
2534 /* 89 01 mov %eax,(%ecx) */
2535 *a++ = 0x89; *a++ = 0x01;
2536 break;
2537
2538 default:
2539 bintrans_write_chunkreturn_fail(&a); /* ret (and fail) */
2540 }
2541
2542 if (load && rt != 0)
2543 store_eax_edx(&a, &dummy_cpu.cd.mips.gpr[rt]);
2544
2545 *addrp = a;
2546 bintrans_write_pc_inc(addrp);
2547 return 1;
2548 }
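/*
 * Illustration (not part of the original file): a minimal C model of the
 * word that the SWL and SWR sequences above leave in eax just before the
 * final "mov %eax,(%ecx)". It assumes a 32-bit little-endian aligned word
 * and a byte offset of 0..3 in ebx, as in the CASE comments above. The
 * names swl_merge() and swr_merge() are hypothetical.
 */

```c
#include <assert.h>
#include <stdint.h>

/* SWL: keep the upper bytes of the memory word, fill the low bytes
 * from the most significant bytes of the register. */
uint32_t swl_merge(uint32_t mem, uint32_t reg, unsigned ofs)
{
	assert(ofs <= 3);
	/* ofs 0..2 give 0xffffff00, 0xffff0000, 0xff000000; ofs 3 gives 0 */
	uint32_t keep = (uint32_t)(0xffffff00u << (8 * ofs));
	return (mem & keep) | (reg >> (8 * (3 - ofs)));
}

/* SWR: keep the lower bytes of the memory word, fill the high bytes
 * from the least significant bytes of the register. */
uint32_t swr_merge(uint32_t mem, uint32_t reg, unsigned ofs)
{
	assert(ofs <= 3);
	/* ofs 0..3 give 0, 0x000000ff, 0x0000ffff, 0x00ffffff */
	uint32_t keep = (uint32_t)((1u << (8 * ofs)) - 1u);
	return (mem & keep) | (reg << (8 * ofs));
}
```

/*
 * With the values from the CASE comments (memory bytes 0x12 0x34 0x56 0x78
 * read as the word 0x78563412, register 0x89abcdef), swl_merge() with
 * offset 1 yields 0x785689ab, i.e. the bytes 0xab 0x89 0x56 0x78.
 */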
2549
2550
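/*
 * Illustration (not part of the original file): the "skip" and "okret"
 * pointers in the function above follow a common back-patching pattern
 * for short forward jumps. The sketch below shows that pattern in
 * isolation; emit_jump8() and patch_rel8() are hypothetical helpers, and
 * only forward jumps of at most 127 bytes are handled.
 */

```c
#include <assert.h>
#include <stddef.h>

/* Emit a two-byte short jump (opcode + rel8 placeholder) and return a
 * pointer to the placeholder so it can be filled in later. */
unsigned char *emit_jump8(unsigned char **a, unsigned char opcode)
{
	unsigned char *disp;

	*(*a)++ = opcode;	/* e.g. 0x75 = jne, 0xeb = jmp */
	disp = *a;
	*(*a)++ = 0;		/* placeholder displacement */
	return disp;
}

/* Fill in the displacement once the jump target is known. rel8 is
 * counted from the byte that follows the displacement byte. */
void patch_rel8(unsigned char *disp, const unsigned char *target)
{
	ptrdiff_t rel = target - (disp + 1);

	assert(rel >= 0 && rel <= 127);
	*disp = (unsigned char)rel;
}
```

/*
 * This computes the same value as the original expression
 * "*skip = (size_t)a - (size_t)skip - 1", with an added range check.
 */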
2551 /*
2552 * bintrans_write_instruction__tlb_rfe_etc():
2553 */
2554 static int bintrans_write_instruction__tlb_rfe_etc(unsigned char **addrp,
2555 int itype)
2556 {
2557 unsigned char *a;
2558 int ofs;
2559
2560 switch (itype) {
2561 case CALL_TLBP:
2562 case CALL_TLBR:
2563 case CALL_TLBWR:
2564 case CALL_TLBWI:
2565 case CALL_RFE:
2566 case CALL_ERET:
2567 case CALL_SYSCALL:
2568 case CALL_BREAK:
2569 break;
2570 default:
2571 return 0;
2572 }
2573
2574 a = *addrp;
2575
2576 /* Put back PC into the cpu struct, both as pc and pc_last */
2577 *a++ = 0x89; *a++ = 0xbe; *a++ = ofs_pc&255;
2578 *a++ = (ofs_pc>>8)&255; *a++ = (ofs_pc>>16)&255;
2579 *a++ = (ofs_pc>>24)&255; /* mov %edi,pc(%esi) */
2580
2581 *a++ = 0x89; *a++ = 0xbe; *a++ = ofs_pc_last&255;
2582 *a++ = (ofs_pc_last>>8)&255; *a++ = (ofs_pc_last>>16)&255;
2583 *a++ = (ofs_pc_last>>24)&255; /* mov %edi,pc_last(%esi) */
2584
2585 /* ... and make sure that the high 32 bits are ALSO in pc_last: */
2586 /* 8b 86 38 12 00 00 mov 0x1238(%esi),%eax */
2587 ofs = ofs_pc + 4;
2588 *a++ = 0x8b; *a++ = 0x86; *a++ = ofs&255;
2589 *a++ = (ofs>>8)&255; *a++ = (ofs>>16)&255;
2590 *a++ = (ofs>>24)&255; /* mov pc+4(%esi),%eax */
2591
2592 /* 89 86 34 12 00 00 mov %eax,0x1234(%esi) */
2593 ofs = ofs_pc_last + 4;
2594 *a++ = 0x89; *a++ = 0x86; *a++ = ofs&255;
2595 *a++ = (ofs>>8)&255; *a++ = (ofs>>16)&255;
2596 *a++ = (ofs>>24)&255; /* mov %eax,pc_last+4(%esi) */
2597
2598 switch (itype) {
2599 case CALL_TLBP:
2600 case CALL_TLBR:
2601 /* push readflag */
2602 *a++ = 0x6a; *a++ = (itype == CALL_TLBR);
2603 ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_tlbpr) - (size_t)&dummy_cpu;
2604 break;
2605 case CALL_TLBWR:
2606 case CALL_TLBWI:
2607 /* push randomflag */
2608 *a++ = 0x6a; *a++ = (itype == CALL_TLBWR);
2609 ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_tlbwri) - (size_t)&dummy_cpu;
2610 break;
2611 case CALL_SYSCALL:
2612 case CALL_BREAK:
2613 /* push exception code */
2614 *a++ = 0x6a; *a++ = (itype == CALL_BREAK? EXCEPTION_BP : EXCEPTION_SYS);
2615 ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_simple_exception) - (size_t)&dummy_cpu;
2616 break;
2617 case CALL_RFE:
2618 ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_rfe) - (size_t)&dummy_cpu;
2619 break;
2620 case CALL_ERET:
2621 ofs = ((size_t)&dummy_cpu.cd.mips.bintrans_fast_eret) - (size_t)&dummy_cpu;
2622 break;
2623 }
2624
2625 /* push cpu (esi) */
2626 *a++ = 0x56;
2627
2628 /* eax = pointer to the helper function: 8b 86 .. mov ofs(%esi),%eax */
2629 *a++ = 0x8b; *a++ = 0x86;
2630 *a++ = ofs; *a++ = ofs >> 8; *a++ = ofs >> 16; *a++ = ofs >> 24;
2631
2632 /* ff d0 call *%eax */
2633 *a++ = 0xff; *a++ = 0xd0;
2634
2635 switch (itype) {
2636 case CALL_RFE:
2637 case CALL_ERET:
2638 /* 83 c4 04 add $4,%esp */
2639 *a++ = 0x83; *a++ = 0xc4; *a++ = 4;
2640 break;
2641 default:
2642 /* 83 c4 08 add $8,%esp */
2643 *a++ = 0x83; *a++ = 0xc4; *a++ = 8;
2644 break;
2645 }
2646
2647 /* Load PC from the cpu struct. */
2648 *a++ = 0x8b; *a++ = 0xbe; *a++ = ofs_pc&255;
2649 *a++ = (ofs_pc>>8)&255; *a++ = (ofs_pc>>16)&255;
2650 *a++ = (ofs_pc>>24)&255; /* mov pc(%esi),%edi */
2651
2652 *addrp = a;
2653
2654 switch (itype) {
2655 case CALL_ERET:
2656 case CALL_SYSCALL:
2657 case CALL_BREAK:
2658 break;
2659 default:
2660 bintrans_write_pc_inc(addrp);
2661 }
2662
2663 return 1;
2664 }
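/*
 * Illustration (not part of the original file): the function above
 * repeatedly writes a two-byte opcode/ModRM pair followed by a 32-bit
 * offset in little-endian byte order. A minimal sketch of that encoding
 * step, using the hypothetical helpers emit_le32() and
 * emit_mov_edi_to_esi_disp32():
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append a 32-bit value, least significant byte first. */
unsigned char *emit_le32(unsigned char *a, uint32_t v)
{
	assert(a != NULL);
	*a++ = v & 255;
	*a++ = (v >> 8) & 255;
	*a++ = (v >> 16) & 255;
	*a++ = (v >> 24) & 255;
	return a;
}

/* 89 be <disp32>   mov %edi,disp32(%esi)
 * ModRM 0xbe = mod 10 (disp32), reg 111 (edi), rm 110 (esi). */
unsigned char *emit_mov_edi_to_esi_disp32(unsigned char *a, uint32_t ofs)
{
	assert(a != NULL);
	*a++ = 0x89;
	*a++ = 0xbe;
	return emit_le32(a, ofs);
}
```

/*
 * For an offset of 0x1234 this writes the six bytes 89 be 34 12 00 00.
 * Replacing the opcode 0x89 with 0x8b gives the load form,
 * mov disp32(%esi),%edi, which the function above uses to reload the PC.
 */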
2665
2666
2667 /*
2668 * bintrans_backend_init():
2669 *
2670 * This is necessary for broken GCC 2.x. (For GCC 3.x, this wouldn't be
2671 * necessary, and the old code would have worked.)
2672 */
2673 static void bintrans_backend_init(void)
2674 {
2675 int size;
2676 unsigned char *p;
2677
2678
2679 /* "runchunk": */
2680 size = 64; /* NOTE: This MUST be enough, or we fail */
2681 p = (unsigned char *)mmap(NULL, size, PROT_READ | PROT_WRITE |
2682 PROT_EXEC, MAP_ANON | MAP_PRIVATE, -1, 0);
2683
2684 /* If mmap() failed (it returns MAP_FAILED, not NULL), try malloc(): */
2685 if (p == MAP_FAILED) {
2686 p = malloc(size);
2687 if (p == NULL) {
2688 fprintf(stderr, "bintrans_backend_init():"
2689 " out of memory\n");
2690 exit(1);
2691 }
2692 }
2693
2694 bintrans_runchunk = (void *)p;
2695
2696 *p++ = 0x57; /* push %edi */
2697 *p++ = 0x56; /* push %esi */
2698 *p++ = 0x55; /* push %ebp */
2699 *p++ = 0x53; /* push %ebx */
2700
2701 /*
2702 * In all translated code, esi points to the cpu struct, and
2703 * ebp is the number of executed (translated) instructions.
2704 */
2705
2706 /* 0=ebx, 4=ebp, 8=esi, 0xc=edi, 0x10=retaddr, 0x14=arg0, 0x18=arg1 */
2707
2708 /* mov 0x14(%esp,1),%esi (arg0 = cpu) */
2709 *p++ = 0x8b; *p++ = 0x74; *p++ = 0x24; *p++ = 0x14;
2710
2711 /* mov nr_instr(%esi),%ebp */
2712 *p++ = 0x8b; *p++ = 0xae; *p++ = ofs_i&255; *p++ = (ofs_i>>8)&255;
2713 *p++ = (ofs_i>>16)&255; *p++ = (ofs_i>>24)&255;
2714
2715 /* mov pc(%esi),%edi */
2716 *p++ = 0x8b; *p++ = 0xbe; *p++ = ofs_pc&255; *p++ = (ofs_pc>>8)&255;
2717 *p++ = (ofs_pc>>16)&255; *p++ = (ofs_pc>>24)&255;
2718
2719 /* call *0x18(%esp,1) */
2720 *p++ = 0xff; *p++ = 0x54; *p++ = 0x24; *p++ = 0x18;
2721
2722 /* mov %ebp,nr_instr(%esi) */
2723 *p++ = 0x89; *p++ = 0xae; *p++ = ofs_i&255; *p++ = (ofs_i>>8)&255;
2724 *p++ = (ofs_i>>16)&255; *p++ = (ofs_i>>24)&255;
2725
2726 /* mov %edi,pc(%esi) */
2727 *p++ = 0x89; *p++ = 0xbe; *p++ = ofs_pc&255; *p++ = (ofs_pc>>8)&255;
2728 *p++ = (ofs_pc>>16)&255; *p++ = (ofs_pc>>24)&255;
2729
2730 *p++ = 0x5b; /* pop %ebx */
2731 *p++ = 0x5d; /* pop %ebp */
2732 *p++ = 0x5e; /* pop %esi */
2733 *p++ = 0x5f; /* pop %edi */
2734 *p++ = 0xc3; /* ret */
2735
2736
2737
2738 /* "jump_to_32bit_pc": */
2739 size = 128; /* NOTE: This MUST be enough, or we fail */
2740 p = (unsigned char *)mmap(NULL, size, PROT_READ | PROT_WRITE |
2741 PROT_EXEC, MAP_ANON | MAP_PRIVATE, -1, 0);
2742
2743 /* If mmap() failed (it returns MAP_FAILED, not NULL), try malloc(): */
2744 if (p == MAP_FAILED) {
2745 p = malloc(size);
2746 if (p == NULL) {
2747 fprintf(stderr, "bintrans_backend_init():"
2748 " out of memory\n");
2749 exit(1);
2750 }
2751 }
2752
2753 bintrans_jump_to_32bit_pc = (void *)p;
2754
2755 /* Don't execute too many instructions. */
2756 /* 81 fd f0 1f 00 00 cmpl $0x1ff0,%ebp */
2757 /* 7c 01 jl <okk> */
2758 /* c3 ret */
2759 *p++ = 0x81; *p++ = 0xfd; *p++ = (N_SAFE_BINTRANS_LIMIT-1) & 255;
2760 *p++ = ((N_SAFE_BINTRANS_LIMIT-1) >> 8) & 255; *p++ = 0; *p++ = 0;
2761 *p++ = 0x7c; *p++ = 0x01;
2762 *p++ = 0xc3;
2763
2764 /*
2765 * ebx = ((vaddr >> 22) & 1023) * sizeof(void *)
2766 *
2767 * 89 c3 mov %eax,%ebx
2768 * c1 eb 14 shr $20,%ebx
2769 * 81 e3 fc 0f 00 00 and $0xffc,%ebx
2770 */
2771 *p++ = 0x89; *p++ = 0xc3;
2772 *p++ = 0xc1; *p++ = 0xeb; *p++ = 0x14;
2773 *p++ = 0x81; *p++ = 0xe3; *p++ = 0xfc; *p++ = 0x0f; *p++ = 0; *p++ = 0;
2774
2775 /*
2776 * ecx = vaddr_to_hostaddr_table0
2777 *
2778 * 8b 8e 34 12 00 00 mov 0x1234(%esi),%ecx
2779 */
2780 *p++ = 0x8b; *p++ = 0x8e;
2781 *p++ = ofs_tabl0 & 255; *p++ = (ofs_tabl0 >> 8) & 255;
2782 *p++ = (ofs_tabl0 >> 16) & 255; *p++ = (ofs_tabl0 >> 24) & 255;
2783
2784 /*
2785 * ecx = vaddr_to_hostaddr_table0[a]
2786 *
2787 * 8b 0c 19 mov (%ecx,%ebx),%ecx
2788 */
2789 *p++ = 0x8b; *p++ = 0x0c; *p++ = 0x19;
2790
2791 /*
2792 * ebx = ((vaddr >> 12) & 1023) * sizeof(void *)
2793 *
2794 * 89 c3 mov %eax,%ebx
2795 * c1 eb 0a shr $10,%ebx
2796 * 81 e3 fc 0f 00 00 and $0xffc,%ebx
2797 */
2798 *p++ = 0x89; *p++ = 0xc3;
2799 *p++ = 0xc1; *p++ = 0xeb; *p++ = 0x0a;
2800 *p++ = 0x81; *p++ = 0xe3; *p++ = 0xfc; *p++ = 0x0f; *p++ = 0; *p++ = 0;
2801
2802 /*
2803 * ecx = vaddr_to_hostaddr_table0[a][b].cd.mips.chunks
2804 *
2805 * 8b 8c 19 56 34 12 00 mov 0x123456(%ecx,%ebx,1),%ecx
2806 */
2807 *p++ = 0x8b; *p++ = 0x8c; *p++ = 0x19; *p++ = ofs_chunks & 255;
2808 *p++ = (ofs_chunks >> 8) & 255; *p++ = (ofs_chunks >> 16) & 255;
2809 *p++ = (ofs_chunks >> 24) & 255;
2810
2811 /*
2812 * ecx = NULL? Then return with failure.
2813 *
2814 * 83 f9 00 cmp $0x0,%ecx
2815 * 75 01 jne <okzzz>
2816 */
2817 *p++ = 0x83; *p++ = 0xf9; *p++ = 0x00;
2818 *p++ = 0x75; *p++ = 0x01;
2819 *p++ = 0xc3; /* TODO: failure? */
2820
2821 /*
2822 * 25 fc 0f 00 00 and $0xffc,%eax
2823 * 01 c1 add %eax,%ecx
2824 *
2825 * 8b 01 mov (%ecx),%eax
2826 *
2827 * 83 f8 00 cmp $0x0,%eax
2828 * 75 01 jne <ok>
2829 * c3 ret
2830 */
2831 *p++ = 0x25; *p++ = 0xfc; *p++ = 0x0f; *p++ = 0; *p++ = 0;
2832 *p++ = 0x01; *p++ = 0xc1;
2833
2834 *p++ = 0x8b; *p++ = 0x01;
2835
2836 *p++ = 0x83; *p++ = 0xf8; *p++ = 0x00;
2837 *p++ = 0x75; *p++ = 0x01;
2838 *p++ = 0xc3; /* TODO: failure? */
2839
2840 /* 03 86 78 56 34 12 add 0x12345678(%esi),%eax */
2841 /* ff e0 jmp *%eax */
2842 *p++ = 0x03; *p++ = 0x86; *p++ = ofs_chunkbase & 255;
2843 *p++ = (ofs_chunkbase >> 8) & 255; *p++ = (ofs_chunkbase >> 16) & 255;
2844 *p++ = (ofs_chunkbase >> 24) & 255;
2845 *p++ = 0xff; *p++ = 0xe0;
2846
2847
2848
2849 /* "loadstore_32bit": */
2850 size = 48; /* NOTE: This MUST be enough, or we fail */
2851 p = (unsigned char *)mmap(NULL, size, PROT_READ | PROT_WRITE |
2852 PROT_EXEC, MAP_ANON | MAP_PRIVATE, -1, 0);
2853
2854 /* If mmap() failed (it returns MAP_FAILED, not NULL), try malloc(): */
2855 if (p == MAP_FAILED) {
2856 p = malloc(size);
2857 if (p == NULL) {
2858 fprintf(stderr, "bintrans_backend_init():"
2859 " out of memory\n");
2860 exit(1);
2861 }
2862 }
2863
2864 bintrans_loadstore_32bit = (void *)p;
2865
2866 /*
2867 * ebx = ((vaddr >> 22) & 1023) * sizeof(void *)
2868 *
2869 * 89 c3 mov %eax,%ebx
2870 * c1 eb 14 shr $20,%ebx
2871 * 81 e3 fc 0f 00 00 and $0xffc,%ebx
2872 */
2873 *p++ = 0x89; *p++ = 0xc3;
2874 *p++ = 0xc1; *p++ = 0xeb; *p++ = 0x14;
2875 *p++ = 0x81; *p++ = 0xe3; *p++ = 0xfc; *p++ = 0x0f; *p++ = 0; *p++ = 0;
2876
2877 /*
2878 * ecx = vaddr_to_hostaddr_table0
2879 *
2880 * 8b 8e 34 12 00 00 mov 0x1234(%esi),%ecx
2881 */
2882 *p++ = 0x8b; *p++ = 0x8e; *p++ = ofs_tabl0 & 255;
2883 *p++ = (ofs_tabl0 >> 8) & 255;
2884 *p++ = (ofs_tabl0 >> 16) & 255; *p++ = (ofs_tabl0 >> 24) & 255;
2885
2886 /*
2887 * ecx = vaddr_to_hostaddr_table0[a]
2888 *
2889 * 8b 0c 19 mov (%ecx,%ebx),%ecx
2890 */
2891 *p++ = 0x8b; *p++ = 0x0c; *p++ = 0x19;
2892
2893 /*
2894 * ebx = ((vaddr >> 12) & 1023) * sizeof(void *)
2895 *
2896 * 89 c3 mov %eax,%ebx
2897 * c1 eb 0a shr $10,%ebx
2898 * 81 e3 fc 0f 00 00 and $0xffc,%ebx
2899 */
2900 *p++ = 0x89; *p++ = 0xc3;
2901 *p++ = 0xc1; *p++ = 0xeb; *p++ = 0x0a;
2902 *p++ = 0x81; *p++ = 0xe3; *p++ = 0xfc; *p++ = 0x0f; *p++ = 0; *p++ = 0;
2903
2904 /*
2905 * ecx = vaddr_to_hostaddr_table0[a][b]
2906 *
2907 * 8b 0c 19 mov (%ecx,%ebx,1),%ecx
2908 */
2909 *p++ = 0x8b; *p++ = 0x0c; *p++ = 0x19;
2910
2911 /* ret */
2912 *p++ = 0xc3;
2913 }
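/*
 * Illustration (not part of the original file): the generated lookup code
 * above computes "((vaddr >> 22) & 1023) * sizeof(void *)" with a single
 * shift and mask. The sketch below shows why "shr $20; and $0xffc" gives
 * the same byte offset, assuming 4-byte pointers as on i386. The function
 * names are hypothetical.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset into the first-level table (index = vaddr bits 31..22). */
uint32_t level1_byte_offset(uint32_t vaddr)
{
	return (vaddr >> 20) & 0xffcu;
}

/* Byte offset into the second-level table (index = vaddr bits 21..12). */
uint32_t level2_byte_offset(uint32_t vaddr)
{
	return (vaddr >> 10) & 0xffcu;
}

/* Check both shortcuts against the straightforward index * 4 form. */
void check_offsets(uint32_t vaddr)
{
	assert(level1_byte_offset(vaddr) == ((vaddr >> 22) & 1023u) * 4u);
	assert(level2_byte_offset(vaddr) == ((vaddr >> 12) & 1023u) * 4u);
}
```

/*
 * Shifting by two bits less than the index position and masking with
 * 0xffc (1023 << 2) folds the multiplication by 4 into the shift, which
 * saves one instruction in each generated lookup.
 */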
2914
