<html><head><title>Gavare's eXperimental Emulator: Technical details</title>
<meta name="robots" content="noarchive,nofollow,noindex"></head>
<body bgcolor="#f8f8f8" text="#000000" link="#4040f0" vlink="#404040" alink="#ff0000">
<table border=0 width=100% bgcolor="#d0d0d0"><tr>
<td width=100% align=center valign=center><table border=0 width=100%><tr>
<td align="left" valign=center bgcolor="#d0efff"><font color="#6060e0" size="6">
<b>Gavare's eXperimental Emulator: </b></font>
<font color="#000000" size="6"><b>Technical details</b>
</font></td></tr></table></td></tr></table><p>

<!--

$Id: technical.html,v 1.67 2005/11/24 12:32:10 debug Exp $

Copyright (C) 2004-2005 Anders Gavare. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
   derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.

-->

<a href="./">Back to the index</a>

<p><br>
<h2>Technical details</h2>

<p>This page describes some of the internals of GXemul.

<p>
<ul>
  <li><a href="#speed">Speed and emulation modes</a>
  <li><a href="#net">Networking</a>
  <li><a href="#devices">Emulation of hardware devices</a>
</ul>


<p><br>
<a name="speed"></a>
<h3>Speed and emulation modes</h3>

So, how fast is GXemul? There is no short answer to this. In particular,
there is no answer to the question <b>What is the slowdown factor?</b>,
because the host architecture and the emulated architecture usually cannot
be compared directly.

<p>Performance depends on several factors, including (but not limited to)
host architecture, host clock speed, which compiler and compiler flags
were used to build the emulator, and what the workload is. For
example, if an emulated operating system tries to read a block from disk,
from its point of view the read was instantaneous (no waiting). So what
counts as one million instructions in an emulated OS might have taken more
than one million instructions on a real machine, which would have had to
wait for the disk.

<p>Also, if the emulator says it has executed 1 million instructions, and
the CPU family in question was capable of scalar execution (i.e. one cycle
per instruction), it might still have taken more than 1 million cycles on
a real machine because of cache misses and similar micro-architectural
penalties that are not simulated by GXemul.

<p>Because of these issues, it is in my opinion best to measure
performance as the actual (real-world) time it takes to perform a task
with the emulator. Typical examples would be "How long does it take to
install NetBSD?", or "How long does it take to compile XYZ inside NetBSD
in the emulator?".

<p>So, how fast is it? :-) Answer: it varies.

<p>The emulation technique used varies depending on which processor type
is being emulated. (One of my main goals with GXemul is to experiment with
different kinds of emulation, so these might change in the future.)

<ul>
  <li><b>MIPS:</b><br>
	There are two emulation modes. The most important one is an
	implementation of a <i>dynamic binary translator</i>.
	(Compared to real binary translators, though, GXemul's bintrans
	subsystem is very simple and does not perform very well.)
	This mode can be used on Alpha and i386 hosts. The other emulation
	mode is simple interpretation, where instructions are read from
	emulated memory and interpreted one at a time. (Slow, but it
	works. It can be forced by using the <tt>-B</tt> command
	line option.)
  <p>
  <li><b>All other modes:</b><br>
	These use a kind of dynamic translation system. (This system does
	not use host-specific backends, so it is not "recompilation" or
	anything like that.) Speed is slower than real binary translation,
	but faster than traditional interpretation, and with some tricks
	it will hopefully still give reasonable speed. The ARM and PowerPC
	emulation modes use this kind of translation.
</ul>
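
<p>The difference between plain interpretation and a translation scheme
without host-specific backends can be sketched in portable C. This is only
an illustration under assumptions: the toy two-instruction machine, the
instruction encoding and all <tt>toy_*</tt> names are hypothetical and are
not taken from GXemul's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy CPU with a single accumulator (not a GXemul struct). */
struct toy_cpu { int64_t acc; };

/* Hypothetical encoding: high byte = opcode, low byte = operand. */
enum { TOY_ADD = 1, TOY_SUB = 2 };

/* Mode 1: plain interpretation. Every instruction is fetched and
   decoded again each time it is executed. */
void toy_interpret(struct toy_cpu *cpu, const uint16_t *code, size_t n)
{
    assert(cpu != NULL && code != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned op = code[i] >> 8, arg = code[i] & 0xff;
        switch (op) {
        case TOY_ADD: cpu->acc += arg; break;
        case TOY_SUB: cpu->acc -= arg; break;
        }
    }
}

/* Mode 2: decode once into (handler, argument) pairs, then call the
   handlers directly. No host machine code is generated. */
struct toy_ic {
    void (*f)(struct toy_cpu *, int64_t);
    int64_t arg;
};

static void toy_add(struct toy_cpu *cpu, int64_t arg) { cpu->acc += arg; }
static void toy_sub(struct toy_cpu *cpu, int64_t arg) { cpu->acc -= arg; }
static void toy_nop(struct toy_cpu *cpu, int64_t arg) { (void)cpu; (void)arg; }

void toy_translate(const uint16_t *code, size_t n, struct toy_ic *out)
{
    assert(code != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned op = code[i] >> 8;
        out[i].arg = code[i] & 0xff;
        out[i].f = op == TOY_ADD ? toy_add :
            op == TOY_SUB ? toy_sub : toy_nop;
    }
}

void toy_run_translated(struct toy_cpu *cpu, const struct toy_ic *ic, size_t n)
{
    assert(cpu != NULL && ic != NULL);
    for (size_t i = 0; i < n; i++)
        ic[i].f(cpu, ic[i].arg);
}
```

<p>In this sketch the decode cost is paid once in <tt>toy_translate</tt>;
code that is executed repeatedly then only pays for an indirect call per
instruction, which is the kind of saving such a scheme aims for.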


<p><br>
<a name="net"></a>
<h3>Networking</h3>

<font color="#ff0000">NOTE/TODO: This section is very old and a bit
out of date.</font>

<p>Running an entire operating system under emulation is very interesting
in itself, but for several reasons, running a modern OS without access to
TCP/IP networking is a bit awkward. Hence, I feel the need to implement
TCP/IP (networking) support in the emulator.

<p>
As far as I have understood it, there seem to be two different ways to go:

<ol>
  <li>Forward ethernet packets from the emulated ethernet controller to
	the host machine's ethernet controller, and capture incoming
	packets on the host's controller, giving them back to the
	emulated OS. Characteristics are:
	<ul>
	  <li>Requires <i>direct</i> access to the host's NIC, which
		means on most platforms that the emulator cannot be
		run as a normal user!
	  <li>Reduced portability, as not every host operating system
		uses the same programming interface for dealing with
		hardware ethernet controllers directly.
	  <li>When run on a switched network, it might be problematic to
		connect from the emulated OS to the OS running on the
		host, as packets sent out on the host's NIC are not
		received by itself. (?)
	  <li>All specific networking protocols will be handled by the
		physical network.
	</ul>
  <p>
  or
  <p>
  <li>Whenever the emulated ethernet controller wishes to send a packet,
	the emulator looks at the packet and creates a response. Packets
	that can have an immediate response never go outside the emulator;
	other packet types have to be converted into suitable other
	connection types (UDP, TCP, etc). Characteristics:
	<ul>
	  <li>Each packet type sent out on the emulated NIC must be handled.
		This means that I have to do a lot of coding.
		(I like this, because it gives me an opportunity to
		learn about networking protocols.)
	  <li>By not relying on access to the host's NIC directly,
		portability is maintained. (It would be sad if the networking
		portion of a portable emulator isn't as portable as the
		rest of the emulator.)
	  <li>The emulator can be run as a normal user process, and does
		not require root privileges.
	  <li>Connecting from the emulated OS to the host's OS should
		not be problematic.
	  <li>The emulated OS will experience the network just as a single
		machine behind a NAT gateway/firewall would. The emulated
		OS is thus automatically protected from the outside world.
	</ul>
</ol>

<p>
Some emulators/simulators use the first approach, while others use the
second. I think that SIMH and QEMU are examples of emulators using the
first and second approach, respectively.

<p>
Since I have chosen the second kind of implementation, I have to write
support explicitly for any kind of network protocol that should be
supported. As of 2004-07-09, the following has been implemented and seems
to work under at least NetBSD/pmax and OpenBSD/pmax under DECstation 5000/200
emulation (-E dec -e 3max):

<p>
<ul>
  <li>ARP requests sent out from the emulated NIC are interpreted,
	and converted to ARP responses. (This is used by the emulated OS
	to find out the MAC address of the gateway.)
  <li>ICMP echo requests (the kind of packet produced by the
	<b><tt>ping</tt></b> program) are interpreted and converted to ICMP echo
	replies, <i>regardless of the IP address</i>. This means that
	running ping from within the emulated OS will <i>always</i>
	receive a response. The ping packets never leave the emulated
	environment.
  <li>UDP packets are interpreted and passed along to the outside world.
	If the emulator receives a UDP packet from the outside world, it
	is converted into a UDP packet for the emulated OS. (This is not
	implemented very well yet, but seems to be enough for nameserver
	lookups, tftp file transfers, and NFS mounts using UDP.)
  <li>TCP packets are interpreted one at a time, similar to how UDP
	packets are handled (but more state is kept for each connection).
	<font color="#ff0000">NOTE: Much of the TCP handling code is very
	ugly and hardcoded.</font>
<!--
  <li>RARP is not implemented yet. (I haven't needed it so far.)
-->
</ul>
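
<p>The ICMP handling described in the list above can be sketched as
follows. This is a minimal illustration, not GXemul's actual code: the
function names <tt>inet_checksum</tt> and <tt>icmp_echo_to_reply</tt> are
hypothetical, and the sketch assumes that the buffer holds exactly one
IPv4 packet, starting at the IP header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Internet checksum (RFC 1071) over len bytes of big-endian 16-bit words. */
static uint16_t inet_checksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((p[i] << 8) | p[i + 1]);
    if (len & 1)
        sum += (uint32_t)(p[len - 1] << 8);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Turn an IPv4 ICMP echo request into an echo reply, in place.
   Assumption: len is the total length of the IP packet.
   Returns 1 if converted, 0 if the packet is not an echo request. */
int icmp_echo_to_reply(uint8_t *ip, size_t len)
{
    assert(ip != NULL);
    if (len < 20 || (ip[0] >> 4) != 4)
        return 0;
    size_t ihl = (size_t)(ip[0] & 0x0f) * 4;
    if (ihl < 20 || len < ihl + 8 || ip[9] != 1 /* protocol: ICMP */)
        return 0;
    uint8_t *icmp = ip + ihl;
    if (icmp[0] != 8 /* type: echo request */)
        return 0;

    /* Swap source and destination addresses. The IP header checksum
       stays valid because its 16-bit words are only reordered. */
    uint8_t tmp[4];
    memcpy(tmp, ip + 12, 4);
    memcpy(ip + 12, ip + 16, 4);
    memcpy(ip + 16, tmp, 4);

    /* Type 0 = echo reply; identifier, sequence and payload are kept. */
    icmp[0] = 0;
    icmp[2] = icmp[3] = 0;
    uint16_t c = inet_checksum(icmp, len - ihl);
    icmp[2] = (uint8_t)(c >> 8);
    icmp[3] = (uint8_t)(c & 0xff);
    return 1;
}
```

<p>Because the reply is built purely from the request, no packet has to
leave the emulator, which matches the "always receives a response"
behaviour described above.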

<p>
The gateway machine, which is the only "other" machine that the emulated
OS sees on its emulated network, works as a NAT-style firewall/gateway. It
usually has a fixed IPv4 address of <tt>10.0.0.254</tt>. An OS running in
the emulator would usually have an address of the form <tt>10.x.x.x</tt>;
a typical choice would be <tt>10.0.0.1</tt>.

<p>
Inside emulated NetBSD/pmax or OpenBSD/pmax, running the following
commands should configure the emulated NIC:
<pre>
	# <b>ifconfig le0 10.0.0.1</b>
	# <b>route add default 10.0.0.254</b>
	add net default: gateway 10.0.0.254
</pre>

<p>
If you want nameserver lookups to work, you need a valid /etc/resolv.conf
as well:
<pre>
	# <b>echo nameserver 129.16.1.3 &gt; /etc/resolv.conf</b>
</pre>
(But replace <tt>129.16.1.3</tt> with the actual real-world IP address of
your nearest nameserver.)

<p>
Now, host lookups should work:
<pre>
	# <b>host -a www.netbsd.org</b>
	Trying null domain
	rcode = 0 (Success), ancount=2
	The following answer is not authoritative:
	The following answer is not verified as authentic by the server:
	www.netbsd.org 86400 IN AAAA 2001:4f8:4:7:290:27ff:feab:19a7
	www.netbsd.org 86400 IN A 204.152.184.116
	For authoritative answers, see:
	netbsd.org 83627 IN NS uucp-gw-2.pa.dec.com
	netbsd.org 83627 IN NS ns.netbsd.org
	netbsd.org 83627 IN NS adns1.berkeley.edu
	netbsd.org 83627 IN NS adns2.berkeley.edu
	netbsd.org 83627 IN NS uucp-gw-1.pa.dec.com
	Additional information:
	ns.netbsd.org 83627 IN A 204.152.184.164
	uucp-gw-1.pa.dec.com 172799 IN A 204.123.2.18
	uucp-gw-2.pa.dec.com 172799 IN A 204.123.2.19
</pre>

<p>
At this point, UDP and TCP should (mostly) work.

<p>
Here is an example of how to configure a server machine and an emulated
client machine for sharing files via NFS:

<p>
(This is very useful if you want to share entire directory trees
between the emulated environment and another machine. These instructions
will work for FreeBSD; if you are running something else, use your
imagination to modify them.)

<p>
<ul>
  <li>On the server, add a line to your /etc/exports file, exporting
	the files you wish to use in the emulator:<pre>
	<b>/tftpboot -mapall=nobody -ro 123.11.22.33</b>
</pre>
	where 123.11.22.33 is the IP address of the machine running the
	emulator process, as seen from the outside world.
  <p>
  <li>Then start up the programs needed to serve NFS via UDP. Note the
	-n argument to mountd. This is needed to tell mountd to accept
	connections from unprivileged ports (because the emulator does
	not need to run as root).<pre>
	# <b>portmap</b>
	# <b>nfsd -u</b>       &lt;--- u for UDP
	# <b>mountd -n</b>
</pre>
  <li>In the guest OS in the emulator, once you have ethernet and IPv4
	configured so that you can use UDP, mounting the filesystem
	should now be possible (this example is for NetBSD/pmax
	or OpenBSD/pmax):<pre>
	# <b>mount -o ro,-r=1024,-w=1024,-U,-3 my.server.com:/tftpboot /mnt</b>
	or
	# <b>mount my.server.com:/tftpboot /mnt</b>
</pre>
	If you don't supply the read and write sizes, there is a risk
	that the default values are too large. The emulator currently
	does not handle fragmentation/defragmentation of <i>outgoing</i>
	packets, so going above the ethernet frame size (1518) is a very
	bad idea. Incoming packets (reading from NFS) should work, though,
	for example during an NFS install.
</ul>

<p>
The example above uses read-only mounts. That is enough for things like
letting NetBSD/pmax or OpenBSD/pmax install via NFS, without the need for
a CDROM ISO image. You can use a read-write mount if you wish to share
files in both directions, but then you should be aware of the
fragmentation issue mentioned above.


<p><br>
<a name="devices"></a>
<h3>Emulation of hardware devices</h3>

Each file called <tt>dev_*.c</tt> in the <tt>src/devices/</tt> directory is
responsible for one hardware device. These are used from
<tt>src/machine.c</tt>, when initializing which hardware a particular
machine model will be using, or when adding devices to a machine using the
<tt>device()</tt> command in configuration files.

<p>(I'll be using the name "<tt>foo</tt>" as the name of the device in all
these examples. This is pseudo code; it might need some modification to
actually compile and run.)

<p>Each device should have the following:

<p>
<ul>
  <li>A <tt>devinit</tt> function in <tt>src/devices/dev_foo.c</tt>. It
	would typically look something like this:
<pre>
	/*
	 *  devinit_foo():
	 */
	int devinit_foo(struct devinit *devinit)
	{
		struct foo_data *d = malloc(sizeof(struct foo_data));

		if (d == NULL) {
			fprintf(stderr, "out of memory\n");
			exit(1);
		}
		memset(d, 0, sizeof(struct foo_data));

		/*
		 *  Set up stuff here, for example fill d with useful
		 *  data. devinit contains settings like address, irq_nr,
		 *  and other things.
		 *
		 *  ...
		 */

		/*  Map the device into the machine's memory:  */
		memory_device_register(devinit->machine->memory, devinit->name,
		    devinit->addr, DEV_FOO_LENGTH,
		    dev_foo_access, (void *)d, DM_DEFAULT, NULL);

		/*  This should only be here if the device
		    has a tick function:  */
		machine_add_tickfunction(devinit->machine, dev_foo_tick, d,
		    FOO_TICKSHIFT);

		/*  Return 1 if the device was successfully added.  */
		return 1;
	}
</pre><br>

  <li>At the top of <tt>dev_foo.c</tt>, the <tt>foo_data</tt> struct
	should be defined.
<pre>
	struct foo_data {
		int	irq_nr;
		/*  ...  */
	};
</pre><br>
	(There is an exception to this rule: ugly hacks which allow
	code in <tt>src/machine.c</tt> to use some structures make it
	necessary to place the <tt>struct foo_data</tt> in
	<tt>src/include/devices.h</tt> instead of in <tt>dev_foo.c</tt>
	itself. This is useful for example for interrupt controllers.)
  <p>
  <li>If <tt>foo</tt> has a tick function (that is, something that needs to be
	run at regular intervals) then <tt>FOO_TICKSHIFT</tt> and a tick
	function need to be defined as well:
<pre>
	#define FOO_TICKSHIFT		14

	void dev_foo_tick(struct cpu *cpu, void *extra)
	{
		struct foo_data *d = (struct foo_data *) extra;

		if (.....)
			cpu_interrupt(cpu, d->irq_nr);
		else
			cpu_interrupt_ack(cpu, d->irq_nr);
	}
</pre><br>

  <li>Does this device belong to a standard bus?
	<ul>
	  <li>If this device should be detectable as a PCI device, then
		glue code should be added to
		<tt>src/devices/bus_pci.c</tt>.
	  <li>If this is a legacy ISA device which should be usable by
		any machine which has an ISA bus, then the device should
		be added to <tt>src/devices/bus_isa.c</tt>.
	</ul>
  <p>
  <li>And last but not least, the device should have an access function.
	The access function is called whenever there is a load or store
	to an address which is in the device's memory-mapped region.
<pre>
	int dev_foo_access(struct cpu *cpu, struct memory *mem,
	    uint64_t relative_addr, unsigned char *data, size_t len,
	    int writeflag, void *extra)
	{
		struct foo_data *d = extra;
		uint64_t idata = 0, odata = 0;

		/*  On a write, fetch the value being stored:  */
		if (writeflag == MEM_WRITE)
			idata = memory_readmax64(cpu, data, len);

		switch (relative_addr) {
		/*  ....  */
		}

		/*  On a read, hand the result back to the caller:  */
		if (writeflag == MEM_READ)
			memory_writemax64(cpu, data, len, odata);

		/*  Perhaps interrupts need to be asserted or
		    deasserted:  */
		dev_foo_tick(cpu, extra);

		/*  Return successfully.  */
		return 1;
	}
</pre><br>
</ul>
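
<p>How a load or store ends up in a device's access function can be
illustrated with a small self-contained mock. Everything here
(<tt>mock_device</tt>, <tt>mock_bus_access</tt>, <tt>mock_foo_access</tt>
and the <tt>MOCK_MEM_*</tt> constants) is hypothetical and much simpler
than the real memory subsystem; it only shows the idea of matching an
address against a registered region and passing on a relative address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MEM_READ   0
#define MOCK_MEM_WRITE  1

/* Hypothetical access-function type, loosely modelled on dev_foo_access
   but without the cpu and memory arguments. */
typedef int (*mock_access_fn)(uint64_t relative_addr, unsigned char *data,
    size_t len, int writeflag, void *extra);

struct mock_device {
    uint64_t base, length;      /* memory-mapped region */
    mock_access_fn access;
    void *extra;                /* per-device data, like foo_data */
};

/* Find the device whose region contains addr and call its access
   function with an address relative to the start of that region.
   Returns the access function's result, or 0 if nothing is mapped. */
int mock_bus_access(struct mock_device *devs, size_t ndevs, uint64_t addr,
    unsigned char *data, size_t len, int writeflag)
{
    assert(devs != NULL || ndevs == 0);
    for (size_t i = 0; i < ndevs; i++) {
        struct mock_device *d = &devs[i];
        if (addr >= d->base && addr - d->base < d->length)
            return d->access(addr - d->base, data, len,
                writeflag, d->extra);
    }
    return 0;
}

/* A one-register device: offset 0 holds a single byte. */
struct mock_foo_data { unsigned char reg; };

int mock_foo_access(uint64_t relative_addr, unsigned char *data,
    size_t len, int writeflag, void *extra)
{
    struct mock_foo_data *d = extra;
    if (relative_addr != 0 || len != 1)
        return 0;               /* device access failure */
    if (writeflag == MOCK_MEM_WRITE)
        d->reg = data[0];
    else
        data[0] = d->reg;
    return 1;
}
```

<p>A device registered at base <tt>0x10000000</tt> would then see a store
to <tt>0x10000000</tt> as a write to relative address 0.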

<p>
Until 2004-07-02, the return value of the access function was a
true/false value: 1 for success, or 0 for device access failure. A device
access failure (on MIPS) will result in a DBE exception.

<p>
Some devices have been converted to support arbitrary memory latency
values. For these, the return value is the number of cycles that the read
or write access took. A value of 1 means one cycle, a value of 10 means 10
cycles. Negative values are used for device access failures, and the
absolute value of the value is then the number of cycles; a value of -5
means that the access failed, and took 5 cycles.

<p>
To be compatible with pre-20040702 devices, a return value of 0 is treated
by the caller (in <tt>src/memory_rw.c</tt>) as a value of -1.
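
<p>The return value convention above can be summarized in a few lines of
C. This is only a sketch of the rules as stated in the text; the
<tt>access_result</tt> struct and <tt>decode_access_result</tt> function
are hypothetical names and do not exist in GXemul.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical decoded form of an access function's return value. */
struct access_result {
    int ok;         /* 1 = access succeeded, 0 = device access failure */
    int cycles;     /* number of cycles the access took */
};

struct access_result decode_access_result(int ret)
{
    struct access_result r;

    /* Pre-20040702 devices return 0 for failure: treat it as -1. */
    if (ret == 0)
        ret = -1;

    assert(ret != INT_MIN);     /* abs(INT_MIN) would overflow */
    r.ok = ret > 0;
    r.cycles = abs(ret);
    return r;
}
```

<p>For example, a return value of -5 decodes to a failed access that took
5 cycles, and a legacy return value of 0 decodes to a failed access that
took 1 cycle.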


</body>
</html>