<html><head><title>Gavare's eXperimental Emulator: Technical details</title>
<meta name="robots" content="noarchive,nofollow,noindex"></head>
<body bgcolor="#f8f8f8" text="#000000" link="#4040f0" vlink="#404040" alink="#ff0000">
<table border=0 width=100% bgcolor="#d0d0d0"><tr>
<td width=100% align=center valign=center><table border=0 width=100%><tr>
<td align="left" valign=center bgcolor="#d0efff"><font color="#6060e0" size="6">
<b>Gavare's eXperimental Emulator:</b></font><br>
<font color="#000000" size="6"><b>Technical details</b>
</font></td></tr></table></td></tr></table><p>

<!--

$Id: technical.html,v 1.74 2006/06/17 10:16:22 debug Exp $

Copyright (C) 2004-2006 Anders Gavare. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
   derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.

-->

<a href="./">Back to the index</a>

<p><br>
<h2>Technical details</h2>

<p>This page describes some of the internals of GXemul.

<p>
<ul>
  <li><a href="#speed">Speed and emulation modes</a>
  <li><a href="#net">Networking</a>
  <li><a href="#devices">Emulation of hardware devices</a>
</ul>

<p><br>
<a name="speed"></a>
<h3>Speed and emulation modes</h3>

So, how fast is GXemul? There is no short answer to this. There is
especially no answer to the question <b>What is the slowdown factor?</b>,
because the host architecture and emulated architecture can usually not be
compared just like that.

<p>Performance depends on several factors, including (but not limited to)
host architecture, target architecture, host clock speed, which compiler
and compiler flags were used to build the emulator, what the workload is,
what additional runtime flags are given to the emulator, and so on.

<p>Devices are generally not timing-accurate: for example, if an emulated
operating system tries to read a block from disk, from its point of view
the read was instantaneous (no waiting). So 1 MIPS in an emulated OS might
have taken more than one million instructions on a real machine.

<p>Also, if the emulator says it has executed 1 million instructions, and
the CPU family in question was capable of scalar execution (i.e. one cycle
per instruction), it might still have taken more than 1 million cycles on
a real machine because of cache misses and similar micro-architectural
penalties that are not simulated by GXemul.

<p>Because of these issues, it is in my opinion best to measure
performance as the actual (real-world) time it takes to perform a task
with the emulator, e.g.:

<ul>
  <li>"How long does it take to install NetBSD onto a disk image?"
  <li>"How long does it take to compile XYZ inside NetBSD
	in the emulator?".
</ul>

<p>So, how fast is it? :-) Answer: it varies.

<p><br>
<a name="net"></a>
<h3>Networking</h3>

<font color="#ff0000">NOTE/TODO: This section is very old and a bit
out of date.</font>

<p>Running an entire operating system under emulation is very interesting
in itself, but for several reasons, running a modern OS without access to
TCP/IP networking is a bit awkward. Hence, I feel the need to implement
TCP/IP (networking) support in the emulator.

<p>
As far as I have understood it, there seem to be two different ways to go:

<ol>
  <li>Forward ethernet packets from the emulated ethernet controller to
	the host machine's ethernet controller, and capture incoming
	packets on the host's controller, giving them back to the
	emulated OS. Characteristics are:
	<ul>
	  <li>Requires <i>direct</i> access to the host's NIC, which
		means on most platforms that the emulator cannot be
		run as a normal user!
	  <li>Reduced portability, as not every host operating system
		uses the same programming interface for dealing with
		hardware ethernet controllers directly.
	  <li>When run on a switched network, it might be problematic to
		connect from the emulated OS to the OS running on the
		host, as packets sent out on the host's NIC are not
		received by itself. (?)
	  <li>All specific networking protocols will be handled by the
		physical network.
	</ul>
  <p>
  or
  <p>
  <li>Whenever the emulated ethernet controller wishes to send a packet,
	the emulator looks at the packet and creates a response. Packets
	that can have an immediate response never go outside the emulator;
	other packet types have to be converted into suitable other
	connection types (UDP, TCP, etc). Characteristics:
	<ul>
	  <li>Each packet type sent out on the emulated NIC must be handled.
		This means that I have to do a lot of coding.
		(I like this, because it gives me an opportunity to
		learn about networking protocols.)
	  <li>By not relying on access to the host's NIC directly,
		portability is maintained. (It would be sad if the networking
		portion of a portable emulator isn't as portable as the
		rest of the emulator.)
	  <li>The emulator can be run as a normal user process; it does
		not require root privileges.
	  <li>Connecting from the emulated OS to the host's OS should
		not be problematic.
	  <li>The emulated OS will experience the network just as a single
		machine behind a NAT gateway/firewall would. The emulated
		OS is thus automatically protected from the outside world.
	</ul>
</ol>

<p>
Some emulators/simulators use the first approach, while others use the
second. I think that SIMH and QEMU are examples of emulators using the
first and second approach, respectively.

<p>
Since I have chosen the second kind of implementation, I have to write
explicit support for every network protocol that should be
supported. As of 2004-07-09, the following has been implemented and seems
to work under at least NetBSD/pmax and OpenBSD/pmax under DECstation 5000/200
emulation (-E dec -e 3max):

<p>
<ul>
  <li>ARP requests sent out from the emulated NIC are interpreted,
	and converted to ARP responses. (This is used by the emulated OS
	to find out the MAC address of the gateway.)
  <li>ICMP echo requests (that is the kind of packet produced by the
	<b><tt>ping</tt></b> program) are interpreted and converted to ICMP echo
	replies, <i>regardless of the IP address</i>. This means that
	running ping from within the emulated OS will <i>always</i>
	receive a response. The ping packets never leave the emulated
	environment.
  <li>UDP packets are interpreted and passed along to the outside world.
	If the emulator receives a UDP packet from the outside world, it
	is converted into a UDP packet for the emulated OS. (This is not
	implemented very well yet, but seems to be enough for nameserver
	lookups, tftp file transfers, and NFS mounts using UDP.)
  <li>TCP packets are interpreted one at a time, similar to how UDP
	packets are handled (but more state is kept for each connection).
	<font color="#ff0000">NOTE: Much of the TCP handling code is very
	ugly and hardcoded.</font>
<!--
  <li>RARP is not implemented yet. (I haven't needed it so far.)
-->
</ul>
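
<p>
To illustrate the ICMP item in the list above, here is a minimal sketch of
turning an echo request (type 8) into an echo reply (type 0) in place and
recomputing the Internet checksum. The function names
<tt>inet_checksum</tt> and <tt>icmp_echo_to_reply</tt> are made up for this
example and are not taken from the emulator's source; a complete reply would
also need the IPv4 and ethernet source/destination addresses swapped, which
is left out here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Internet checksum (RFC 1071 style) over len bytes, treating the
 * buffer as big-endian 16-bit words. Hypothetical helper name.
 */
uint16_t inet_checksum(const unsigned char *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)                        /* odd trailing byte, zero-padded */
        sum += (uint32_t)buf[len - 1] << 8;
    while (sum >> 16)                   /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/*
 * Convert an ICMP echo request into an echo reply, in place.
 * icmp points at the ICMP header, len is the length of the ICMP
 * message (header plus payload). Returns 1 if the packet was
 * converted, 0 if it was not an echo request. Hypothetical helper.
 */
int icmp_echo_to_reply(unsigned char *icmp, size_t len)
{
    uint16_t csum;

    if (len < 8 || icmp[0] != 8 || icmp[1] != 0)
        return 0;

    icmp[0] = 0;                        /* type 0 = echo reply */
    icmp[2] = icmp[3] = 0;              /* clear checksum before summing */
    csum = inet_checksum(icmp, len);
    icmp[2] = (unsigned char)(csum >> 8);
    icmp[3] = (unsigned char)(csum & 0xff);
    return 1;
}
```

<p>
The identifier, sequence number and payload are left untouched, which is
what lets <tt>ping</tt> in the guest match the reply to its request.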

<p>
The gateway machine, which is the only "other" machine that the emulated
OS sees on its emulated network, works as a NAT-style firewall/gateway. It
usually has a fixed IPv4 address of <tt>10.0.0.254</tt>. An OS running in
the emulator would usually have an address of the form <tt>10.x.x.x</tt>;
a typical choice would be <tt>10.0.0.1</tt>.

<p>
Inside emulated NetBSD/pmax or OpenBSD/pmax, running the following
commands should configure the emulated NIC:
<pre>
	# <b>ifconfig le0 10.0.0.1</b>
	# <b>route add default 10.0.0.254</b>
	add net default: gateway 10.0.0.254
</pre>

<p>
If you want nameserver lookups to work, you need a valid /etc/resolv.conf
as well:
<pre>
	# <b>echo nameserver 129.16.1.3 &gt; /etc/resolv.conf</b>
</pre>
(But replace <tt>129.16.1.3</tt> with the actual real-world IP address of
your nearest nameserver.)

<p>
Now, host lookups should work:
<pre>
	# <b>host -a www.netbsd.org</b>
	Trying null domain
	rcode = 0 (Success), ancount=2
	The following answer is not authoritative:
	The following answer is not verified as authentic by the server:
	www.netbsd.org 86400 IN AAAA 2001:4f8:4:7:290:27ff:feab:19a7
	www.netbsd.org 86400 IN A 204.152.184.116
	For authoritative answers, see:
	netbsd.org 83627 IN NS uucp-gw-2.pa.dec.com
	netbsd.org 83627 IN NS ns.netbsd.org
	netbsd.org 83627 IN NS adns1.berkeley.edu
	netbsd.org 83627 IN NS adns2.berkeley.edu
	netbsd.org 83627 IN NS uucp-gw-1.pa.dec.com
	Additional information:
	ns.netbsd.org 83627 IN A 204.152.184.164
	uucp-gw-1.pa.dec.com 172799 IN A 204.123.2.18
	uucp-gw-2.pa.dec.com 172799 IN A 204.123.2.19
</pre>

<p>
At this point, UDP and TCP should (mostly) work.

<p>
Here is an example of how to configure a server machine and an emulated
client machine for sharing files via NFS:

<p>
(This is very useful if you want to share entire directory trees
between the emulated environment and another machine. These instructions
will work for FreeBSD; if you are running something else, use your
imagination to modify them.)

<p>
<ul>
  <li>On the server, add a line to your /etc/exports file, exporting
	the files you wish to use in the emulator:<pre>
	<b>/tftpboot -mapall=nobody -ro 123.11.22.33</b>
</pre>
	where 123.11.22.33 is the IP address of the machine running the
	emulator process, as seen from the outside world.
  <p>
  <li>Then start up the programs needed to serve NFS via UDP. Note the
	-n argument to mountd. This is needed to tell mountd to accept
	connections from unprivileged ports (because the emulator does
	not need to run as root).<pre>
	# <b>portmap</b>
	# <b>nfsd -u</b>       &lt;--- u for UDP
	# <b>mountd -n</b>
</pre>
  <li>In the guest OS in the emulator, once you have ethernet and IPv4
	configured so that you can use UDP, mounting the filesystem
	should now be possible: (this example is for NetBSD/pmax
	or OpenBSD/pmax)<pre>
	# <b>mount -o ro,-r=1024,-w=1024,-U,-3 my.server.com:/tftpboot /mnt</b>
or
	# <b>mount my.server.com:/tftpboot /mnt</b>
</pre>
	If you don't supply the read and write sizes, there is a risk
	that the default values are too large. The emulator currently
	does not handle fragmentation/defragmentation of <i>outgoing</i>
	packets, so going above the ethernet frame size (1518 bytes) is a very
	bad idea. Incoming packets (reading from NFS) should work, though,
	for example during an NFS install.
</ul>

The example above uses read-only mounts. That is enough for things like
letting NetBSD/pmax or OpenBSD/pmax install via NFS, without the need for
a CDROM ISO image. You can use a read-write mount if you wish to share
files in both directions, but then you should be aware of the
fragmentation issue mentioned above.
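
<p>
As a rough guide to why 1024-byte read and write sizes are safe, the
following sketch computes how much UDP payload fits into a single 1518-byte
ethernet frame. It assumes a 14-byte ethernet header, a 4-byte frame check
sequence, a 20-byte IPv4 header without options and an 8-byte UDP header;
the names <tt>max_udp_payload</tt> and <tt>fits_in_one_frame</tt> are
invented for this example and are not part of the emulator.

```c
#include <stddef.h>

enum {
    ETH_FRAME_MAX = 1518,   /* maximum ethernet frame size, as in the text */
    ETH_HEADER    = 14,     /* destination MAC, source MAC, ethertype */
    ETH_FCS       = 4,      /* frame check sequence */
    IPV4_HEADER   = 20,     /* assumption: no IPv4 options */
    UDP_HEADER    = 8
};

/* Largest UDP payload that fits in one frame without IP fragmentation. */
size_t max_udp_payload(void)
{
    return ETH_FRAME_MAX - ETH_HEADER - ETH_FCS - IPV4_HEADER - UDP_HEADER;
}

/* Returns 1 if a UDP payload of the given size fits in a single frame. */
int fits_in_one_frame(size_t udp_payload_len)
{
    return udp_payload_len <= max_udp_payload();
}
```

<p>
Under these assumptions the limit is 1472 bytes, so a 1024-byte NFS transfer
leaves 448 bytes for the RPC and NFS headers that travel in the same UDP
packet, whereas a transfer size of several kilobytes would need
fragmentation.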

<p><br>
<a name="devices"></a>
<h3>Emulation of hardware devices</h3>

Each file called <tt>dev_*.c</tt> in the <tt>src/devices/</tt> directory is
responsible for one hardware device. These are used from
<tt>src/machines/machine_*.c</tt>, when initializing which hardware a particular
machine model will be using, or when adding devices to a machine using the
<tt>device()</tt> command in configuration files.

<p>(I'll be using the name "<tt>foo</tt>" as the name of the device in all
these examples. This is pseudocode; it might need some modification to
actually compile and run.)

<p>Each device should have the following:

<p>
<ul>
  <li>A <tt>devinit</tt> function in <tt>src/devices/dev_foo.c</tt>. It
	would typically look something like this:
<pre>
	DEVINIT(foo)
	{
		struct foo_data *d = malloc(sizeof(struct foo_data));

		if (d == NULL) {
			fprintf(stderr, "out of memory\n");
			exit(1);
		}
		memset(d, 0, sizeof(struct foo_data));

		/*
		 * Set up stuff here, for example fill d with useful
		 * data. devinit contains settings like address, irq_nr,
		 * and other things.
		 *
		 * ...
		 */

		memory_device_register(devinit->machine->memory, devinit->name,
		    devinit->addr, DEV_FOO_LENGTH,
		    dev_foo_access, (void *)d, DM_DEFAULT, NULL);

		/* This should only be here if the device
		   has a tick function: */
		machine_add_tickfunction(devinit->machine, dev_foo_tick, d,
		    FOO_TICKSHIFT);

		/* Return 1 if the device was successfully added. */
		return 1;
	}
</pre><br>

	<p><tt>DEVINIT(foo)</tt> is defined as <tt>int devinit_foo(struct devinit *devinit)</tt>,
	and the <tt>devinit</tt> argument contains everything that the device driver's
	initialization function needs.

  <p>
  <li>At the top of <tt>dev_foo.c</tt>, the <tt>foo_data</tt> struct
	should be defined.
<pre>
	struct foo_data {
		int	irq_nr;
		/* ... */
	};
</pre><br>
	(There is an exception to this rule; ugly hacks which allow
	code in <tt>src/machine.c</tt> to use some structures make it
	necessary to place the <tt>struct foo_data</tt> in
	<tt>src/include/devices.h</tt> instead of in <tt>dev_foo.c</tt>
	itself. This is useful for example for interrupt controllers.)
  <p>
  <li>If <tt>foo</tt> has a tick function (that is, something that needs to be
	run at regular intervals) then <tt>FOO_TICKSHIFT</tt> and a tick
	function need to be defined as well:
<pre>
	#define FOO_TICKSHIFT		14

	void dev_foo_tick(struct cpu *cpu, void *extra)
	{
		struct foo_data *d = (struct foo_data *) extra;

		if (.....)
			cpu_interrupt(cpu, d->irq_nr);
		else
			cpu_interrupt_ack(cpu, d->irq_nr);
	}
</pre><br>

  <li>Does this device belong to a standard bus?
	<ul>
	  <li>If this device should be detectable as a PCI device, then
		glue code should be added to
		<tt>src/devices/bus_pci.c</tt>.
	  <li>If this is a legacy ISA device which should be usable by
		any machine which has an ISA bus, then the device should
		be added to <tt>src/devices/bus_isa.c</tt>.
	</ul>
  <p>
  <li>And last but not least, the device should have an access function.
	The access function is called whenever there is a load or store
	to an address which is in the device's memory-mapped region. To
	simplify things a little, a macro <tt>DEVICE_ACCESS(x)</tt>
	is expanded into<pre>
	int dev_x_access(struct cpu *cpu, struct memory *mem,
	    uint64_t relative_addr, unsigned char *data, size_t len,
	    int writeflag, void *extra)
</pre>	The access function can look like this:
<pre>
	DEVICE_ACCESS(foo)
	{
		struct foo_data *d = extra;
		uint64_t idata = 0, odata = 0;

		/* On a write, fetch the value being stored: */
		if (writeflag == MEM_WRITE)
			idata = memory_readmax64(cpu, data, len);

		switch (relative_addr) {
		/* .... */
		}

		/* On a read, hand the result back to the caller: */
		if (writeflag == MEM_READ)
			memory_writemax64(cpu, data, len, odata);

		/* Perhaps interrupts need to be asserted or
		   deasserted: */
		dev_foo_tick(cpu, extra);

		/* Return successfully. */
		return 1;
	}
</pre><br>
</ul>

<p>
Until 2004-07-02, the return value of the access function was a
true/false value: 1 for success, or 0 for device access failure. A device
access failure (on MIPS) will result in a DBE exception.

<p>
Some devices have been converted to support arbitrary memory latency
values. For these, the return value is the number of cycles that the read or
write access took. A value of 1 means one cycle, a value of 10 means 10
cycles. Negative values are used for device access failures, and the
absolute value is then the number of cycles; a value of -5
means that the access failed, and took 5 cycles.

<p>
To be compatible with pre-20040702 devices, a return value of 0 is treated
by the caller (in <tt>src/memory_rw.c</tt>) as a value of -1.
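
<p>
The return value convention described above can be summarized in a few
lines of C. This is only a sketch of the rules as stated here, not the
actual code in <tt>src/memory_rw.c</tt>; the names
<tt>struct access_result</tt> and <tt>decode_access_result</tt> are
hypothetical.

```c
#include <stdlib.h>

/* Hypothetical decoded form of an access function's return value. */
struct access_result {
    int ok;         /* 1 if the access succeeded, 0 on device access failure */
    int cycles;     /* number of cycles the access took */
};

struct access_result decode_access_result(int ret)
{
    struct access_result r;

    /* Pre-20040702 devices return 0 for failure; treat it as -1. */
    if (ret == 0)
        ret = -1;

    r.ok = ret > 0;         /* positive: success, negative: failure */
    r.cycles = abs(ret);    /* magnitude is the latency in cycles */
    return r;
}
```

<p>
For example, a return value of 10 decodes to a successful access that took
10 cycles, -5 to a failed access that took 5 cycles, and 0 to a failed
access that took one cycle.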

</body>
</html>