--- trunk/doc/technical.html 2007/10/08 16:18:00 4 +++ trunk/doc/technical.html 2007/10/08 16:18:38 12 @@ -1,19 +1,16 @@ - -
- - + Back to the index
-This page describes some of the internals of GXemul. +
This page describes some of the internals of GXemul.
-In reality, a lot of things need to be handled. Before each instruction is -executed, the emulator checks to see if any interrupts are asserted which -are not masked away. If so, then an INT exception is generated. Exceptions -cause the program counter to be set to a specific value, and some of the -system coprocessor's registers to be set to values signifying what kind of -exception it was (an interrupt exception in this case). - -
-Reading instructions from memory is done through a TLB, a translation -lookaside buffer. The TLB on MIPS is software controlled, which means that -the program running inside the emulator (for example an operating system -kernel) has to take care of manually updating the TLB. Some memory -addresses are translated into physical addresses directly, some are -translated into valid physical addresses via the TLB, and some memory -references are not valid. Invalid memory references cause exceptions. - -
-After an instruction has been read from memory, the emulator checks which -opcode it contains and executes the instruction. Executing an instruction -usually involves reading some register and writing some register, or perhaps a -load from memory (or a store to memory). The program counter is increased -for every instruction. - -
-Some memory references point to physical addresses which are not in the -normal RAM address space. They may point to hardware devices. If that is -the case, then loads and stores are converted into calls to a device -access function. The device access function is then responsible for -handling these reads and writes. For example, a graphical framebuffer -device may put a pixel on the screen when a value is written to it, or a -serial controller device may output a character to stdout when written to. - -
-Mode a is very slow. On a 2.8 GHz Intel Xeon host the resulting -emulated machine is roughly equal to a 7 MHz R3000 (or a 3.5 MHz R4000). -The actual performance varies a lot, maybe between 5 and 10 million -instructions per second, depending on workload. - -
-Mode b ("bintrans") is still to be considered experimental, but -gives higher performance than mode a. It translates MIPS machine -code into machine code that can be executed on the host machine -on-the-fly. The translation itself obviously takes some time, but this is -usually made up for by the fact that the translated code chunks are -executed multiple times. -To run the emulator with binary translation enabled, just add -b -to the command line. - -
-Only small pieces of MIPS machine code are translated, usually the size of -a function, or less. There is no "intermediate representation" code, so -all translations are done directly from MIPS to host machine code. - -
-The default bintrans cache size is 16 MB, but you can change this by adding --DDEFAULT_BINTRANS_SIZE_IN_MB=xx to your CFLAGS environment variable -before running the configure script, or by using the bintrans_size() -configuration file option when running the emulator. +
-By default, an emulated OS running under DECstation emulation which listens to -interrupts from the mc146818 clock will get interrupts that are close to the -host's clock. That is, if the emulated OS says it wants 100 interrupts per -second, it will get approximately 100 interrupts per real second. +So, how fast is GXemul? There is no good answer to this. There is +especially no answer to the question What is the slowdown factor?, +because the host architecture and emulated architecture can usually not be +compared just like that. + +
Performance depends on several factors, including (but not limited to) +host architecture, host clock speed, which compiler and compiler flags +were used to build the emulator, what the workload is, and so on. For +example, if an emulated operating system tries to read a block from disk, +from its point of view the read was instantaneous (no waiting). So 1 MIPS +in an emulated OS might have taken more than one million instructions on a +real machine. + +
Also, if the emulator says it has executed 1 million instructions, and +the CPU family in question was capable of scalar execution (i.e. one cycle +per instruction), it might still have taken more than 1 million cycles on +a real machine because of cache misses and similar micro-architectural +penalties that are not simulated by GXemul. + +
Because of these issues, it is in my opinion best to measure +performance as the actual (real-world) time it takes to perform a task +with the emulator. Typical examples would be "How long does it take to +install NetBSD?", or "How long does it take to compile XYZ inside NetBSD +in the emulator?". + +
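As a rough illustration of measuring real-world time instead of counting instructions, the following C sketch times an arbitrary command from the host side. The helper names `time_command` and `report` are made up for this example and are not part of GXemul; the command string is whatever task you want to measure.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper (not part of GXemul): run a command via system()
 * and return the wall-clock seconds it took, or -1.0 on failure.
 * time() has one-second resolution, which is enough for long tasks
 * such as an OS install or a large compile.
 */
static double time_command(const char *command)
{
	time_t start, end;

	start = time(NULL);
	if (start == (time_t)-1 || system(command) == -1)
		return -1.0;
	end = time(NULL);
	if (end == (time_t)-1)
		return -1.0;

	return difftime(end, start);
}

/* Hypothetical helper: print the result as "how long did it take". */
static void report(const char *task, double seconds)
{
	if (seconds < 0.0)
		printf("%s: could not be timed\n", task);
	else
		printf("%s: %.0f seconds of real time\n", task, seconds);
}
```

A call such as `report("my task", time_command("sleep 2"))` would then print the elapsed real time for that command; the number is only comparable between runs on the same host and build.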
The emulation technique used varies depending on which processor type +is being emulated. (One of my main goals with GXemul is to experiment with +different kinds of emulation, so these might change in the future.) -
-There is however a -I option, which sets the number of emulated cycles per -second to a fixed value. Let's say you wish to make the emulated OS think it -is running on a 40 MHz DECstation, and not a 7 MHz one, then you can add --I 40000000 to the command line. This will not make the emulation faster, of -course. It might even make it seem slower; for example, if NetBSD/pmax waits -2 seconds for SCSI devices to settle during bootup, those 2 seconds will take -2*40000000 cycles (which will take more time than 2*7000000). +
+
-The -I option is also necessary if you want to run deterministic experiments, -if a mc146818 device is present. -
-Some emulators make claims such as "x times slowdown," but in the case of -GXemul, the host is often not a MIPS-based machine, and hence comparing -one MIPS instruction to a host instruction doesn't work. Performance depends on -a lot of factors, including (but not limited to) host architecture, host speed, -which compiler and compiler flags were used to build GXemul, what the -workload is, and so on. For example, if an emulated operating system tries -to read a block from disk, from its point of view the read was instantaneous -(no waiting). So 1 MIPS in an emulated OS might have taken more than one -million instructions on a real machine. Because of this, imho it is best -to measure performance as the actual (real-world) time it takes to perform -a task with the emulator. @@ -181,10 +123,13 @@
Running an entire operating system under emulation is very interesting +in itself, but for several reasons, running a modern OS without access to +TCP/IP networking is a bit awkward. Hence, I feel the need to implement +TCP/IP (networking) support in the emulator.
As far as I have understood it, there seem to be two different ways to go: @@ -205,6 +150,8 @@ connect from the emulated OS to the OS running on the host, as packets sent out on the host's NIC are not received by itself. (?) +
or @@ -233,8 +180,10 @@ -Other emulators that I have heard of seem to use the first one, if they -support networking. +
+Some emulators/simulators use the first approach, while others use the +second. I think that SIMH and QEMU are examples of emulators using the +first and second approach, respectively.
Since I have chosen the second kind of implementation, I have to write @@ -249,7 +198,7 @@ and converted to ARP responses. (This is used by the emulated OS to find out the MAC address of the gateway.)
The gateway machine, which is the only "other" machine that the emulated OS sees on its emulated network, works as a NAT-style firewall/gateway. It -has a fixed IPv4 address of 10.0.0.254. An OS running in the emulator -can thus have any 10.x.x.x address; a typical choice would be 10.0.0.1. +usually has a fixed IPv4 address of 10.0.0.254. An OS running in +the emulator would usually have an address of the form 10.x.x.x; +a typical choice would be 10.0.0.1.
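The addressing convention above can be sketched in C as follows. This is only an illustration under stated assumptions: the function names are hypothetical and do not come from GXemul, and the gateway address is hard-coded to the usual 10.0.0.254 mentioned above.

```c
#include <stdint.h>

/* Build a host-order IPv4 address from its four dotted-quad octets. */
static uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	    ((uint32_t)c << 8) | (uint32_t)d;
}

/* Hypothetical check: is addr of the form 10.x.x.x? */
static int is_10_net(uint32_t addr)
{
	return (addr >> 24) == 10;
}

/*
 * Hypothetical check: a plausible address for the emulated OS is one
 * inside 10.x.x.x that does not collide with the gateway itself
 * (assumed here to be the usual 10.0.0.254).
 */
static int is_plausible_guest_addr(uint32_t addr)
{
	const uint32_t gateway = ipv4(10, 0, 0, 254);

	return is_10_net(addr) && addr != gateway;
}
```

With these helpers, the typical choice 10.0.0.1 is accepted, while the gateway's own address and anything outside 10.x.x.x are rejected.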
-Inside emulated NetBSD or OpenBSD, running the following commands should -configure the emulated NIC: +Inside emulated NetBSD/pmax or OpenBSD/pmax, running the following +commands should configure the emulated NIC:
    # ifconfig le0 10.0.0.1
    # route add default 10.0.0.254
    add net default: gateway 10.0.0.254
+
If you want nameserver lookups to work, you need a valid /etc/resolv.conf as well:
    # echo nameserver 129.16.1.3 > /etc/resolv.conf
-(But replace 129.16.1.3 with the actual real-world IP address of your
-nearest nameserver.)
+(But replace 129.16.1.3 with the actual real-world IP address of
+your nearest nameserver.)
+
Now, host lookups should work:
@@ -309,33 +264,20 @@ uucp-gw-2.pa.dec.com 172799 IN A 204.123.2.19-To transfer files via UDP, you can use the tftp program. - -
- # tftp 12.34.56.78 - tftp> get filename - Received XXXXXX bytes in X.X seconds - tftp> quit - # -- -or, to do it non-interactively (with ugly output): - -
- # echo get filename | tftp 12.34.56.78 - tftp> Received XXXXXX bytes in X.X seconds - tftp> # -+
+At this point, UDP and TCP should (mostly) work. -This, of course, requires that you have put the file filename in -the root directory of the tftp server (12.34.56.78). +
+Here is an example of how to configure a server machine and an emulated +client machine for sharing files via NFS:
-It is also possible to run NFS via UDP. This is very useful if you want to -share entire directory trees between the emulated environment and another -machine. These instructions will work for FreeBSD; if you are running -something else, use your imagination to modify them: +(This is very useful if you want to share entire directory trees +between the emulated environment and another machine. These instructions +will work for FreeBSD; if you are running something else, use your +imagination to modify them.) +
@@ -374,10 +316,11 @@ files in both directions, but then you should be aware of the fragmentation issue mentioned above. --TCP is implemented to some extent, but should not be considered to be -stable yet. It is enough to let NetBSD/pmax and OpenBSD/pmax install via -ftp, though. +
TODO: Write a section on how to connect multiple emulator instances. +(Using the local_port and add_remote configuration file +commands.) + + @@ -386,27 +329,25 @@
Emulation of hardware devices
-Each file in the device/ directory is responsible for one hardware device. -These are used from src/machine.c, when initializing which hardware a -particular machine model will be using, or when adding devices to a -machine using the device() command in configuration files. +Each file in the src/device/ directory is responsible for one +hardware device. These are used from src/machine.c, when +initializing which hardware a particular machine model will be using, or +when adding devices to a machine using the device() command in +configuration files. --NOTE: 2005-02-26: I'm currently rewriting the -device registry subsystem. +
NOTE: The device registry subsystem is currently +in a state of flux, as it is being redesigned. -
-(I'll be using the name 'foo' as the name of the device in all these -examples. This is pseudo code, it might need some modification to +
(I'll be using the name "foo" as the name of the device in all +these examples. This is pseudo code, it might need some modification to actually compile and run.) -
-Each device should have the following: +
Each device should have the following:
/* * devinit_foo(): @@ -443,7 +384,8 @@ }
struct foo_data { int irq_nr; @@ -451,9 +393,9 @@ }
#define FOO_TICKSHIFT 10 @@ -498,7 +440,7 @@
-The return value of the access function has until 20040702 been a +The return value of the access function has until 2004-07-02 been a true/false value; 1 for success, or 0 for device access failure. A device access failure (on MIPS) will result in a DBE exception. @@ -512,78 +454,9 @@
To be compatible with pre-20040702 devices, a return value of 0 is treated -by the caller (in src/memory.c) as a value of -1. - - - +by the caller (in src/memory_rw.c) as a value of -1. -
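The compatibility rule can be written as a one-line mapping in C. This is a minimal sketch, not the actual code in the caller: the function name is hypothetical, and the only behaviour taken from the text is that a returned 0 is treated as -1 while other values are passed through unchanged.

```c
/*
 * Hypothetical helper illustrating the compatibility rule: a device
 * access function written before the change returns 0 on failure,
 * and the caller treats that 0 as if the device had returned -1.
 * All other return values are left as they are.
 */
static int normalize_access_result(int result)
{
	return result == 0 ? -1 : result;
}
```

An old-style device that returns 1 for success therefore keeps working unmodified, and its 0 for failure is folded into the -1 case.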
-NOTE: The regression testing framework is basically just a skeleton so far. -Regression tests are very good to have. However, the fact that complete -operating systems can run in the emulator indicates that the emulation is -probably not too incorrect. This makes it less of a priority to write -regression tests. - -
-To run all the regression tests, type make regtest. Each assembly -language file matching the pattern test_*.S will be compiled and -linked into a 64-bit MIPS ELF (using a gcc cross compiler), and run in the -emulator. If everything goes well, you should see something like this: - -
- $ make regtest - cd tests; make run_tests; cd .. - gcc33 -Wall -fomit-frame-pointer -fmove-all-movables -fpeephole -O2 - -mcpu=ev5 -I/usr/X11R6/include -lm -L/usr/X11R6/lib -lX11 do_tests.c - -o do_tests - do_tests.c: In function `main': - do_tests.c:173: warning: unused variable `s' - /var/tmp//ccFOupvD.o: In function `do_tests': - /var/tmp//ccFOupvD.o(.text+0x3a8): warning: tmpnam() possibly used - unsafely; consider using mkstemp() - mips64-unknown-elf-gcc -g -O3 -fno-builtin -fschedule-insns -mips64 - -mabi=64 test_common.c -c -o test_common.o - ./do_tests "mips64-unknown-elf-gcc -g -O3 -fno-builtin -fschedule-insns - -mips64 -mabi=64" "mips64-unknown-elf-as -mabi=64 -mips64" - "mips64-unknown-elf-ld -Ttext 0xa800000000030000 -e main - --oformat=elf64-bigmips" "../gxemul" - - Starting tests: - test_addu.S (-a) - test_addu.S (-a -b) - test_clo_clz.S (-a) - test_clo_clz.S (-a -b) - .. - test_unaligned.S (-a) - test_unaligned.S (-a -b) - - Done. (12 tests done) - PASS: 12 - FAIL: 0 - - ---------------- - - All tests OK - - ---------------- -- -
-Each test writes output to stdout, and there is a test_*.good for -each .S file which contains the wanted output. If the actual output -matches the .good file, then the test passes, otherwise it fails. - -
-Read tests/README for more information.