<html><head><title>Gavare's eXperimental Emulator:&nbsp;&nbsp;&nbsp;Technical details</title>
<meta name="robots" content="noarchive,nofollow,noindex"></head>
<body bgcolor="#f8f8f8" text="#000000" link="#4040f0" vlink="#404040" alink="#ff0000">
<table border=0 width=100% bgcolor="#d0d0d0"><tr>
<td width=100% align=center valign=center><table border=0 width=100%><tr>
<td align="left" valign=center bgcolor="#d0efff"><font color="#6060e0" size="6">
<b>Gavare's eXperimental Emulator:&nbsp;&nbsp;&nbsp;</b></font>
<font color="#000000" size="6"><b>Technical details</b>
</font></td></tr></table></td></tr></table><p>

<!--

$Id: technical.html,v 1.67 2005/11/24 12:32:10 debug Exp $

Copyright (C) 2004-2005  Anders Gavare.  All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
   derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.

-->


<a href="./">Back to the index</a>

<p><br>
<h2>Technical details</h2>

<p>This page describes some of the internals of GXemul.

<p>
<ul>
  <li><a href="#speed">Speed and emulation modes</a>
  <li><a href="#net">Networking</a>
  <li><a href="#devices">Emulation of hardware devices</a>
</ul>


<p><br>
<a name="speed"></a>
<h3>Speed and emulation modes</h3>

So, how fast is GXemul? There is no short answer to this. There is 
especially no answer to the question <b>What is the slowdown factor?</b>, 
because the host architecture and emulated architecture can usually not be 
compared just like that.

<p>Performance depends on several factors, including (but not limited to)  
host architecture, host clock speed, which compiler and compiler flags
were used to build the emulator, what the workload is, and so on. For
example, if an emulated operating system tries to read a block from disk,
from its point of view the read was instantaneous (no waiting). So a task
that takes one million instructions in an emulated OS might have taken more
than one million instructions on a real machine, which has to wait for the
disk.

<p>Also, if the emulator says it has executed 1 million instructions, and 
the CPU family in question was capable of scalar execution (i.e. one cycle 
per instruction), it might still have taken more than 1 million cycles on 
a real machine because of cache misses and similar micro-architectural 
penalties that are not simulated by GXemul.

<p>Because of these issues, it is in my opinion best to measure
performance as the actual (real-world) time it takes to perform a task
with the emulator. Typical examples would be "How long does it take to 
install NetBSD?", or "How long does it take to compile XYZ inside NetBSD 
in the emulator?".

<p>So, how fast is it? :-)&nbsp;&nbsp;&nbsp;Answer: it varies.

<p>The emulation technique used varies depending on which processor type 
is being emulated. (One of my main goals with GXemul is to experiment with 
different kinds of emulation, so these might change in the future.)

<ul>
  <li><b>MIPS:</b><br>
        There are two emulation modes. The most important one is an
        implementation of a <i>dynamic binary translator</i>.
        (Compared to real binary translators, though, GXemul's bintrans
        subsystem is very simple and does not perform very well.)
        This mode can be used on Alpha and i386 hosts. The other emulation
        mode is simple interpretation, where instructions are read from
        emulated memory and interpreted one at a time. (Slow, but it
        works. It can be forced with the <tt>-B</tt> command
        line option.)
  <p>
  <li><b>All other modes:</b><br>
        These use a kind of dynamic translation system. (This system does
        not use host-specific backends, so it is not "recompilation" or
        anything like that.) Speed is slower than real binary translation,
        but faster than traditional interpretation, and with some tricks
        it will hopefully still give reasonable speed. The ARM and PowerPC
        emulation modes use this kind of translation.
</ul>
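
<p>To illustrate what "simple interpretation" means, here is a minimal
sketch of a fetch/decode/execute loop. It uses a made-up toy instruction
set (4 bytes per instruction: opcode, destination register, source
register, immediate), not MIPS, and the names <tt>toy_cpu</tt> and
<tt>toy_run</tt> are hypothetical; nothing here is taken from GXemul's
actual source code.

```c
#include <stddef.h>
#include <stdint.h>

/*  Hypothetical toy ISA (NOT MIPS). Each instruction is 4 bytes:
    opcode, destination register, source register, immediate.  */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADD = 2, OP_ADDI = 3 };

struct toy_cpu {
        uint32_t        pc;             /*  byte offset into emulated memory  */
        uint32_t        reg[4];
        uint64_t        ninstrs;        /*  instructions executed so far  */
};

/*
 *  toy_run():
 *
 *  Interprets one instruction at a time until HALT, an unknown opcode,
 *  or the end of emulated memory. The HALT instruction itself is counted.
 *  Returns the total number of instructions executed.
 */
uint64_t toy_run(struct toy_cpu *cpu, const uint8_t *mem, size_t memsize)
{
        while ((size_t)cpu->pc + 4 <= memsize) {
                /*  Fetch:  */
                const uint8_t *ib = mem + cpu->pc;
                uint8_t op = ib[0], rd = ib[1] & 3, rs = ib[2] & 3;
                uint8_t imm = ib[3];

                cpu->pc += 4;
                cpu->ninstrs ++;

                /*  Decode and execute:  */
                switch (op) {
                case OP_LOADI:  cpu->reg[rd] = imm;
                                break;
                case OP_ADD:    cpu->reg[rd] += cpu->reg[rs];
                                break;
                case OP_ADDI:   cpu->reg[rd] += imm;
                                break;
                case OP_HALT:
                default:        return cpu->ninstrs;
                }
        }

        return cpu->ninstrs;
}
```

<p>Every instruction pays for the fetch and the <tt>switch</tt> each time
it is executed, which is why this approach is slow compared to any kind of
translation that caches the decoded form.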


<p><br>
<a name="net"></a>
<h3>Networking</h3>

<font color="#ff0000">NOTE/TODO: This section is very old and a bit
out of date.</font>

<p>Running an entire operating system under emulation is very interesting
in itself, but for several reasons, running a modern OS without access to
TCP/IP networking is a bit awkward. Hence, I feel the need to implement
TCP/IP (networking) support in the emulator.

<p>
As far as I have understood it, there seem to be two different ways to go:

<ol>
  <li>Forward ethernet packets from the emulated ethernet controller to
        the host machine's ethernet controller, and capture incoming 
        packets on the host's controller, giving them back to the
        emulated OS. Characteristics are:
        <ul>
          <li>Requires <i>direct</i> access to the host's NIC, which
                means on most platforms that the emulator cannot be
                run as a normal user!
          <li>Reduced portability, as not every host operating system
                uses the same programming interface for dealing with
                hardware ethernet controllers directly.
          <li>When run on a switched network, it might be problematic to
                connect from the emulated OS to the OS running on the
                host, as packets sent out on the host's NIC are not
                received by itself. (?)
          <li>All specific networking protocols will be handled by the
                physical network.
        </ul>
  <p>
  or
  <p>
  <li>Whenever the emulated ethernet controller wishes to send a packet,
        the emulator looks at the packet and creates a response. Packets
        that can have an immediate response never go outside the emulator;
        other packet types have to be converted into suitable other
        connection types (UDP, TCP, etc). Characteristics:
        <ul>
          <li>Each packet type sent out on the emulated NIC must be handled.
                This means that I have to do a lot of coding.
                (I like this, because it gives me an opportunity to
                learn about networking protocols.)
          <li>By not relying on access to the host's NIC directly,
                portability is maintained. (It would be sad if the networking
                portion of a portable emulator isn't as portable as the
                rest of the emulator.)
          <li>The emulator can be run as a normal user process, and does
                not require root privileges.
          <li>Connecting from the emulated OS to the host's OS should
                not be problematic.
          <li>The emulated OS will experience the network just as a single
                machine behind a NAT gateway/firewall would. The emulated
                OS is thus automatically protected from the outside world.
        </ul>
</ol>

<p>
Some emulators/simulators use the first approach, while others use the 
second. I think that SIMH and QEMU are examples of emulators using the 
first and second approach, respectively.

<p>
Since I have chosen the second kind of implementation, I have to write 
support explicitly for any kind of network protocol that should be
supported. As of 2004-07-09, the following has been implemented and seems 
to work under at least NetBSD/pmax and OpenBSD/pmax under DECstation 5000/200
emulation (-E dec -e 3max):

<p>
<ul>
  <li>ARP requests sent out from the emulated NIC are interpreted,
        and converted to ARP responses. (This is used by the emulated OS
        to find out the MAC address of the gateway.)
  <li>ICMP echo requests (that is the kind of packet produced by the
        <b><tt>ping</tt></b> program) are interpreted and converted to ICMP echo
        replies, <i>regardless of the IP address</i>. This means that
        running ping from within the emulated OS will <i>always</i>
        receive a response. The ping packets never leave the emulated
        environment.
  <li>UDP packets are interpreted and passed along to the outside world.
        If the emulator receives a UDP packet from the outside world, it
        is converted into a UDP packet for the emulated OS. (This is not
        implemented very well yet, but seems to be enough for nameserver
        lookups, tftp file transfers, and NFS mounts using UDP.)
  <li>TCP packets are interpreted one at a time, similar to how UDP 
        packets are handled (but more state is kept for each connection).
        <font color="#ff0000">NOTE: Much of the TCP handling code is very
        ugly and hardcoded.</font>
<!--
  <li>RARP is not implemented yet. (I haven't needed it so far.)
-->
</ul>
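
<p>As an illustration of how a packet can be answered without ever
leaving the emulator, here is a minimal sketch that turns an ICMP echo
request into an echo reply in place. The function names are hypothetical
and this is not GXemul's actual networking code; it only covers the ICMP
part of the packet (swapping the IP source and destination addresses and
the ethernet addresses is not shown).

```c
#include <stddef.h>
#include <stdint.h>

/*
 *  inet_checksum():
 *
 *  The Internet checksum (RFC 1071): one's complement of the one's
 *  complement sum of all big-endian 16-bit words. An odd trailing
 *  byte is padded with a zero byte.
 */
static uint16_t inet_checksum(const uint8_t *p, size_t len)
{
        uint32_t sum = 0;
        size_t i;

        for (i = 0; i + 1 < len; i += 2)
                sum += (uint32_t)((p[i] << 8) | p[i + 1]);
        if (len & 1)
                sum += (uint32_t)(p[len - 1] << 8);

        /*  Fold the carries back into the low 16 bits:  */
        while (sum >> 16)
                sum = (sum & 0xffff) + (sum >> 16);

        return (uint16_t)~sum;
}

/*
 *  icmp_echo_to_reply():
 *
 *  icmp points to the ICMP header (i.e. just after the IP header) and
 *  len is the length of the ICMP header plus payload. Returns 1 if the
 *  packet was an echo request and has been converted into an echo
 *  reply, 0 if it was left untouched.
 */
int icmp_echo_to_reply(uint8_t *icmp, size_t len)
{
        uint16_t sum;

        /*  Type 8, code 0 = echo request.  */
        if (len < 8 || icmp[0] != 8 || icmp[1] != 0)
                return 0;

        icmp[0] = 0;                    /*  Type 0 = echo reply  */
        icmp[2] = icmp[3] = 0;          /*  Checksum field is zero while summing  */
        sum = inet_checksum(icmp, len);
        icmp[2] = (uint8_t)(sum >> 8);
        icmp[3] = (uint8_t)(sum & 0xff);

        return 1;
}
```

<p>The identifier, sequence number and payload are left unchanged, which
is what lets <tt>ping</tt> in the guest match the reply to its request.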

<p>
The gateway machine, which is the only "other" machine that the emulated 
OS sees on its emulated network, works as a NAT-style firewall/gateway. It 
usually has a fixed IPv4 address of <tt>10.0.0.254</tt>. An OS running in 
the emulator would usually have an address of the form <tt>10.x.x.x</tt>;
a typical choice would be <tt>10.0.0.1</tt>.

<p>
Inside emulated NetBSD/pmax or OpenBSD/pmax, running the following 
commands should configure the emulated NIC:
<pre>
        # <b>ifconfig le0 10.0.0.1</b>
        # <b>route add default 10.0.0.254</b>
        add net default: gateway 10.0.0.254
</pre>

<p>
If you want nameserver lookups to work, you need a valid /etc/resolv.conf 
as well:
<pre>
        # <b>echo nameserver 129.16.1.3 &gt; /etc/resolv.conf</b>
</pre>
(But replace <tt>129.16.1.3</tt> with the actual real-world IP address of 
your nearest nameserver.)

<p>
Now, host lookups should work:
<pre>
        # <b>host -a www.netbsd.org</b>
        Trying null domain
        rcode = 0 (Success), ancount=2
        The following answer is not authoritative:
        The following answer is not verified as authentic by the server:
        www.netbsd.org  86400 IN        AAAA    2001:4f8:4:7:290:27ff:feab:19a7
        www.netbsd.org  86400 IN        A       204.152.184.116
        For authoritative answers, see:
        netbsd.org      83627 IN        NS      uucp-gw-2.pa.dec.com
        netbsd.org      83627 IN        NS      ns.netbsd.org
        netbsd.org      83627 IN        NS      adns1.berkeley.edu
        netbsd.org      83627 IN        NS      adns2.berkeley.edu
        netbsd.org      83627 IN        NS      uucp-gw-1.pa.dec.com
        Additional information:
        ns.netbsd.org   83627 IN        A       204.152.184.164
        uucp-gw-1.pa.dec.com    172799 IN       A       204.123.2.18
        uucp-gw-2.pa.dec.com    172799 IN       A       204.123.2.19
</pre>

<p>
At this point, UDP and TCP should (mostly) work.

<p>
Here is an example of how to configure a server machine and an emulated 
client machine for sharing files via NFS:

<p>
(This is very useful if you want to share entire directory trees
between the emulated environment and another machine. These instructions
will work for FreeBSD; if you are running something else, use your
imagination to modify them.)

<p>
<ul>
  <li>On the server, add a line to your /etc/exports file, exporting
        the files you wish to use in the emulator:<pre>
        <b>/tftpboot -mapall=nobody -ro 123.11.22.33</b>
</pre>
        where 123.11.22.33 is the IP address of the machine running the
        emulator process, as seen from the outside world.
  <p>
  <li>Then start up the programs needed to serve NFS via UDP. Note the
        -n argument to mountd. This is needed to tell mountd to accept
        connections from unprivileged ports (because the emulator does
        not need to run as root).<pre>
        # <b>portmap</b>
        # <b>nfsd -u</b>       &lt;--- u for UDP
        # <b>mountd -n</b>
</pre>
  <li>In the guest OS in the emulator, once you have ethernet and IPv4
        configured so that you can use UDP, mounting the filesystem
        should now be possible:  (this example is for NetBSD/pmax
        or OpenBSD/pmax)<pre>
        # <b>mount -o ro,-r=1024,-w=1024,-U,-3 my.server.com:/tftpboot /mnt</b>
    or
        # <b>mount my.server.com:/tftpboot /mnt</b>
</pre>
        If you don't supply the read and write sizes, there is a risk
        that the default values are too large. The emulator currently
        does not handle fragmentation/defragmentation of <i>outgoing</i>
        packets, so going above the ethernet frame size (1518) is a very
        bad idea. Incoming packets (reading from nfs) should work, though,
        for example during an NFS install.
</ul>

The example above uses read-only mounts. That is enough for things like
letting NetBSD/pmax or OpenBSD/pmax install via NFS, without the need for
a CDROM ISO image. You can use a read-write mount if you wish to share
files in both directions, but then you should be aware of the 
fragmentation issue mentioned above.
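
<p>To get a feeling for why a read/write size of 1024 is safe while larger
values are not, here is a small sketch of the arithmetic. It assumes a
1518-byte ethernet frame made up of a 14-byte header, the payload, and a
4-byte frame check sequence, plus an IPv4 header without options (20
bytes) and a UDP header (8 bytes). The amount of RPC/NFS header overhead
per request is left as a parameter, since it varies; the function names
are hypothetical.

```c
#include <stddef.h>

enum {
        ETH_FRAME_MAX   = 1518, /*  Largest ethernet frame  */
        ETH_HEADER      = 14,   /*  dst MAC, src MAC, type  */
        ETH_FCS         = 4,    /*  Frame check sequence  */
        IPV4_HEADER     = 20,   /*  Assumes no IP options  */
        UDP_HEADER      = 8
};

/*  Largest UDP payload that fits in a single, unfragmented frame.  */
size_t udp_payload_max(void)
{
        return ETH_FRAME_MAX - ETH_HEADER - ETH_FCS
            - IPV4_HEADER - UDP_HEADER;
}

/*
 *  Returns 1 if data_len bytes of file data plus rpc_overhead bytes of
 *  RPC/NFS headers fit in one frame, 0 if the datagram would have to be
 *  fragmented.
 */
int fits_in_one_frame(size_t data_len, size_t rpc_overhead)
{
        return data_len + rpc_overhead <= udp_payload_max();
}
```

<p>With these assumptions the limit is 1472 bytes of UDP payload, so a
1024-byte write leaves a few hundred bytes for the RPC and NFS headers,
whereas a write size of several kilobytes cannot fit in one frame.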


<p><br>
<a name="devices"></a>
<h3>Emulation of hardware devices</h3>

Each file called <tt>dev_*.c</tt> in the <tt>src/devices/</tt> directory is
responsible for one hardware device. These are used from
<tt>src/machine.c</tt>, when initializing which hardware a particular
machine model will be using, or when adding devices to a machine using the
<tt>device()</tt> command in configuration files.

<p>(I'll be using the name "<tt>foo</tt>" as the name of the device in all
these examples.  This is pseudo code; it might need some modification to
actually compile and run.)

<p>Each device should have the following:

<p>
<ul>
  <li>A <tt>devinit</tt> function in <tt>src/devices/dev_foo.c</tt>. It
        would typically look something like this:
<pre>
        /*
         *  devinit_foo():
         */
        int devinit_foo(struct devinit *devinit)
        {
                struct foo_data *d = malloc(sizeof(struct foo_data));

                if (d == NULL) {
                        fprintf(stderr, "out of memory\n");
                        exit(1);
                }
                memset(d, 0, sizeof(struct foo_data));

                /*
                 *  Set up stuff here, for example fill d with useful
                 *  data. devinit contains settings like address, irq_nr,
                 *  and other things.
                 *
                 *  ...
                 */
        
                memory_device_register(devinit->machine->memory, devinit->name,
                    devinit->addr, DEV_FOO_LENGTH,
                    dev_foo_access, (void *)d, DM_DEFAULT, NULL);
        
                /*  This should only be here if the device
                    has a tick function:  */
                machine_add_tickfunction(devinit->machine, dev_foo_tick, d,
                    FOO_TICKSHIFT);

                /*  Return 1 if the device was successfully added.  */
                return 1;       
        }       
</pre><br>

  <li>At the top of <tt>dev_foo.c</tt>, the <tt>foo_data</tt> struct
        should be defined.
<pre>
        struct foo_data {
                int     irq_nr;
                /*  ...  */
        };
</pre><br>
        (There is an exception to this rule; ugly hacks which allow
        code in <tt>src/machine.c</tt> to use some structures make it
        necessary to place the <tt>struct foo_data</tt> in
        <tt>src/include/devices.h</tt> instead of in <tt>dev_foo.c</tt>
        itself. This is useful for example for interrupt controllers.)
  <p>
  <li>If <tt>foo</tt> has a tick function (that is, something that needs to be
        run at regular intervals) then <tt>FOO_TICKSHIFT</tt> and a tick 
        function need to be defined as well:
<pre>
        #define FOO_TICKSHIFT           14

        void dev_foo_tick(struct cpu *cpu, void *extra)
        {
                struct foo_data *d = (struct foo_data *) extra;

                if (.....)
                        cpu_interrupt(cpu, d->irq_nr);
                else
                        cpu_interrupt_ack(cpu, d->irq_nr);
        }
</pre><br>

  <li>Does this device belong to a standard bus?
        <ul>
          <li>If this device should be detectable as a PCI device, then
                glue code should be added to
                <tt>src/devices/bus_pci.c</tt>.
          <li>If this is a legacy ISA device which should be usable by
                any machine which has an ISA bus, then the device should
                be added to <tt>src/devices/bus_isa.c</tt>.
        </ul>
  <p>
  <li>And last but not least, the device should have an access function.
        The access function is called whenever there is a load or store
        to an address which is in the device's memory-mapped region.
<pre>
        int dev_foo_access(struct cpu *cpu, struct memory *mem,
            uint64_t relative_addr, unsigned char *data, size_t len,
            int writeflag, void *extra)
        {
                struct foo_data *d = extra;
                uint64_t idata = 0, odata = 0;

                /*  On a write, data holds the value being stored:  */
                if (writeflag == MEM_WRITE)
                        idata = memory_readmax64(cpu, data, len);
                switch (relative_addr) {
                /* .... */
                }

                if (writeflag == MEM_READ)
                        memory_writemax64(cpu, data, len, odata);

                /*  Perhaps interrupts need to be asserted or
                    deasserted:  */
                dev_foo_tick(cpu, extra);

                /*  Return successfully.  */
                return 1;
        }
</pre><br>
</ul>

<p>
Until 2004-07-02, the return value of the access function was a 
true/false value: 1 for success, or 0 for device access failure. A device 
access failure (on MIPS) will result in a DBE exception.

<p>
Some devices have been converted to support arbitrary memory latency
values. For these, the return value is the number of cycles that the read or 
write access took. A value of 1 means one cycle, a value of 10 means 10 
cycles. Negative values are used for device access failures, and the 
absolute value of the value is then the number of cycles; a value of -5 
means that the access failed, and took 5 cycles.

<p>
To be compatible with pre-20040702 devices, a return value of 0 is treated 
by the caller (in <tt>src/memory_rw.c</tt>) as a value of -1.
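
<p>The return value convention described above can be summarized in a few
lines of code. This is only a sketch of the rule as stated here, with
hypothetical names (<tt>access_result</tt>,
<tt>decode_access_return</tt>); it is not the actual code in
<tt>src/memory_rw.c</tt>.

```c
struct access_result {
        int     ok;             /*  1 = access succeeded, 0 = failed  */
        int     cycles;         /*  Number of cycles the access took  */
};

/*
 *  decode_access_return():
 *
 *  Interprets the return value of a device access function:
 *
 *      ret > 0         success, took ret cycles
 *      ret < 0         failure, took -ret cycles
 *      ret == 0        old-style failure, treated as -1
 */
struct access_result decode_access_return(int ret)
{
        struct access_result r;

        if (ret == 0)
                ret = -1;

        r.ok = ret > 0;
        r.cycles = ret > 0 ? ret : -ret;

        return r;
}
```

<p>Note that an old-style device which returns 1 for success needs no
special treatment: it is simply seen as a successful access that took one
cycle.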


</body>
</html>
1	<html><head><title>Gavare's eXperimental Emulator:   Technical details</title>
2	<meta name="robots" content="noarchive,nofollow,noindex"></head>
3	<body bgcolor="#f8f8f8" text="#000000" link="#4040f0" vlink="#404040" alink="#ff0000">
4	<table border=0 width=100% bgcolor="#d0d0d0"><tr>
5	<td width=100% align=center valign=center><table border=0 width=100%><tr>
6	<td align="left" valign=center bgcolor="#d0efff"><font color="#6060e0" size="6">
7	<b>Gavare's eXperimental Emulator:   </b></font>
8	<font color="#000000" size="6"><b>Technical details</b>
9	</font></td></tr></table></td></tr></table><p>
10
11	<!--
12
13	$Id: technical.html,v 1.67 2005/11/24 12:32:10 debug Exp $
14
15	Copyright (C) 2004-2005 Anders Gavare. All rights reserved.
16
17	Redistribution and use in source and binary forms, with or without
18	modification, are permitted provided that the following conditions are met:
19
20	1. Redistributions of source code must retain the above copyright
21	notice, this list of conditions and the following disclaimer.
22	2. Redistributions in binary form must reproduce the above copyright
23	notice, this list of conditions and the following disclaimer in the
24	documentation and/or other materials provided with the distribution.
25	3. The name of the author may not be used to endorse or promote products
26	derived from this software without specific prior written permission.
27
28	THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
29	ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
30	IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
31	ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
32	FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
33	DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
34	OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
35	HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
36	LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
37	OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
38	SUCH DAMAGE.
39
40	-->
41
42
43
44	<a href="./">Back to the index</a>
45
46	<p><br>
47	<h2>Technical details</h2>
48
49	<p>This page describes some of the internals of GXemul.
50
51	<p>
52	<ul>
53	<li><a href="#speed">Speed and emulation modes</a>
54	<li><a href="#net">Networking</a>
55	<li><a href="#devices">Emulation of hardware devices</a>
56	</ul>
57
58
59
60
61
62
63	<p><br>
64	<a name="speed"></a>
65	<h3>Speed and emulation modes</h3>
66
67	So, how fast is GXemul? There is no short answer to this. There is
68	especially no answer to the question <b>What is the slowdown factor?</b>,
69	because the host architecture and emulated architecture can usually not be
70	compared just like that.
71
72	<p>Performance depends on several factors, including (but not limited to)
73	host architecture, host clock speed, which compiler and compiler flags
74	were used to build the emulator, what the workload is, and so on. For
75	example, if an emulated operating system tries to read a block from disk,
76	from its point of view the read was instantaneous (no waiting). So 1 MIPS
77	in an emulated OS might have taken more than one million instructions on a
78	real machine.
79
80	<p>Also, if the emulator says it has executed 1 million instructions, and
81	the CPU family in question was capable of scalar execution (i.e. one cycle
82	per instruction), it might still have taken more than 1 million cycles on
83	a real machine because of cache misses and similar micro-architectural
84	penalties that are not simulated by GXemul.
85
86	<p>Because of these issues, it is in my opinion best to measure
87	performance as the actual (real-world) time it takes to perform a task
88	with the emulator. Typical examples would be "How long does it take to
89	install NetBSD?", or "How long does it take to compile XYZ inside NetBSD
90	in the emulator?".
91
92	<p>So, how fast is it? :-)   Answer: it varies.
93
94	<p>The emulation technique used varies depending on which processor type
95	is being emulated. (One of my main goals with GXemul is to experiment with
96	different kinds of emulation, so these might change in the future.)
97
98	<ul>
99	<li><b>MIPS:</b><br>
100	There are two emulation modes. The most important one is an
101	implementation of a <i>dynamic binary translator</i>.
102	(Compared to real binary translators, though, GXemul's bintrans
103	subsystem is very simple and does not perform very well.)
104	This mode can be used on Alpha and i386 host. The other emulation
105	mode is simple interpretation, where an instruction is read from
106	emulated memory, and interpreted one-at-a-time. (Slow, but it
107	works. It can be forcefully used by using the <tt>-B</tt> command
108	line option.)
109	<p>
110	<li><b>All other modes:</b><br>
111	These use a kind of dynamic translation system. (This system does
112	not use host-specific backends, so it is not "recompilation" or
113	anything like that.) Speed is slower than real binary translation,
114	but faster than traditional interpretation, and with some tricks
115	it will hopefully still give reasonable speed. The ARM and PowerPC
116	emulation modes uses this kind of translation.
117	</ul>
118
119
120
121
122
123
124	<p><br>
125	<a name="net"></a>
126	<h3>Networking</h3>
127
128	<font color="#ff0000">NOTE/TODO: This section is very old and a bit
129	out of date.</font>
130
131	<p>Running an entire operating system under emulation is very interesting
132	in itself, but for several reasons, running a modern OS without access to
133	TCP/IP networking is a bit akward. Hence, I feel the need to implement
134	TCP/IP (networking) support in the emulator.
135
136	<p>
137	As far as I have understood it, there seems to be two different ways to go:
138
139	<ol>
140	<li>Forward ethernet packets from the emulated ethernet controller to
141	the host machine's ethernet controller, and capture incoming
142	packets on the host's controller, giving them back to the
143	emulated OS. Characteristics are:
144	<ul>
145	<li>Requires <i>direct</i> access to the host's NIC, which
146	means on most platforms that the emulator cannot be
147	run as a normal user!
148	<li>Reduced portability, as not every host operating system
149	uses the same programming interface for dealing with
150	hardware ethernet controllers directly.
151	<li>When run on a switched network, it might be problematic to
152	connect from the emulated OS to the OS running on the
153	host, as packets sent out on the host's NIC are not
154	received by itself. (?)
155	<li>All specific networking protocols will be handled by the
156	physical network.
157	</ul>
158	<p>
159	or
160	<p>
161	<li>Whenever the emulated ethernet controller wishes to send a packet,
162	the emulator looks at the packet and creates a response. Packets
163	that can have an immediate response never go outside the emulator,
164	other packet types have to be converted into suitable other
165	connection types (UDP, TCP, etc). Characteristics:
166	<ul>
167	<li>Each packet type sent out on the emulated NIC must be handled.
168	This means that I have to do a lot of coding.
169	(I like this, because it gives me an opportunity to
170	learn about networking protocols.)
171	<li>By not relying on access to the host's NIC directly,
172	portability is maintained. (It would be sad if the networking
173	portion of a portable emulator isn't as portable as the
174	rest of the emulator.)
175	<li>The emulator can be run as a normal user process, does
176	not require root privilegies.
177	<li>Connecting from the emulated OS to the host's OS should
178	not be problematic.
179	<li>The emulated OS will experience the network just as a single
180	machine behind a NAT gateway/firewall would. The emulated
181	OS is thus automatically protected from the outside world.
182	</ul>
183	</ol>
184
185	<p>
186	Some emulators/simulators use the first approach, while others use the
187	second. I think that SIMH and QEMU are examples of emulators using the
188	first and second approach, respectively.
189
190	<p>
191	Since I have choosen the second kind of implementation, I have to write
192	support explicitly for any kind of network protocol that should be
193	supported. As of 2004-07-09, the following has been implemented and seems
194	to work under at least NetBSD/pmax and OpenBSD/pmax under DECstation 5000/200
195	emulation (-E dec -e 3max):
196
197	<p>
198	<ul>
199	<li>ARP requests sent out from the emulated NIC are interpreted,
200	and converted to ARP responses. (This is used by the emulated OS
201	to find out the MAC address of the gateway.)
202	<li>ICMP echo requests (that is the kind of packet produced by the
203	<b><tt>ping</tt></b> program) are interpreted and converted to ICMP echo
204	replies, <i>regardless of the IP address</i>. This means that
205	running ping from within the emulated OS will <i>always</i>
206	receive a response. The ping packets never leave the emulated
207	environment.
208	<li>UDP packets are interpreted and passed along to the outside world.
209	If the emulator receives an UDP packet from the outside world, it
210	is converted into an UDP packet for the emulated OS. (This is not
211	implemented very well yet, but seems to be enough for nameserver
212	lookups, tftp file transfers, and NFS mounts using UDP.)
213	<li>TCP packets are interpreted one at a time, similar to how UDP
214	packets are handled (but more state is kept for each connection).
215	<font color="#ff0000">NOTE: Much of the TCP handling code is very
216	ugly and hardcoded.</font>
217	<!--
218	<li>RARP is not implemented yet. (I haven't needed it so far.)
219	-->
220	</ul>
221
222	<p>
223	The gateway machine, which is the only "other" machine that the emulated
224	OS sees on its emulated network, works as a NAT-style firewall/gateway. It
225	usually has a fixed IPv4 address of <tt>10.0.0.254</tt>. An OS running in
226	the emulator would usually have an address of the form <tt>10.x.x.x</tt>;
227	a typical choice would be <tt>10.0.0.1</tt>.
228
229	<p>
230	Inside emulated NetBSD/pmax or OpenBSD/pmax, running the following
231	commands should configure the emulated NIC:
232	<pre>
233	# <b>ifconfig le0 10.0.0.1</b>
234	# <b>route add default 10.0.0.254</b>
235	add net default: gateway 10.0.0.254
236	</pre>
237
238	<p>
239	If you want nameserver lookups to work, you need a valid /etc/resolv.conf
240	as well:
241	<pre>
242	# <b>echo nameserver 129.16.1.3 > /etc/resolv.conf</b>
243	</pre>
244	(But replace <tt>129.16.1.3</tt> with the actual real-world IP address of
245	your nearest nameserver.)
246
247	<p>
248	Now, host lookups should work:
249	<pre>
250	# <b>host -a www.netbsd.org</b>
251	Trying null domain
252	rcode = 0 (Success), ancount=2
253	The following answer is not authoritative:
254	The following answer is not verified as authentic by the server:
255	www.netbsd.org 86400 IN AAAA 2001:4f8:4:7:290:27ff:feab:19a7
256	www.netbsd.org 86400 IN A 204.152.184.116
257	For authoritative answers, see:
258	netbsd.org 83627 IN NS uucp-gw-2.pa.dec.com
259	netbsd.org 83627 IN NS ns.netbsd.org
260	netbsd.org 83627 IN NS adns1.berkeley.edu
261	netbsd.org 83627 IN NS adns2.berkeley.edu
262	netbsd.org 83627 IN NS uucp-gw-1.pa.dec.com
263	Additional information:
264	ns.netbsd.org 83627 IN A 204.152.184.164
265	uucp-gw-1.pa.dec.com 172799 IN A 204.123.2.18
266	uucp-gw-2.pa.dec.com 172799 IN A 204.123.2.19
267	</pre>
268
269	<p>
270	At this point, UDP and TCP should (mostly) work.
271
272	<p>
273	Here is an example of how to configure a server machine and an emulated
274	client machine for sharing files via NFS:
275
276	<p>
277	(This is very useful if you want to share entire directory trees
278	between the emulated environment and another machine. These instruction
279	will work for FreeBSD, if you are running something else, use your
280	imagination to modify them.)
281
282	<p>
283	<ul>
284	<li>On the server, add a line to your /etc/exports file, exporting
285	the files you wish to use in the emulator:<pre>
286	<b>/tftpboot -mapall=nobody -ro 123.11.22.33</b>
287	</pre>
288	where 123.11.22.33 is the IP address of the machine running the
289	emulator process, as seen from the outside world.
290	<p>
291	<li>Then start up the programs needed to serve NFS via UDP. Note the
292	-n argument to mountd. This is needed to tell mountd to accept
293	connections from unprivileged ports (because the emulator does
294	not need to run as root).<pre>
295	# <b>portmap</b>
296	# <b>nfsd -u</b> <--- u for UDP
297	# <b>mountd -n</b>
298	</pre>
<li>In the guest OS in the emulator, once you have ethernet and IPv4
configured so that you can use UDP, mounting the filesystem
should now be possible (this example is for NetBSD/pmax
or OpenBSD/pmax):<pre>
# <b>mount -o ro,-r=1024,-w=1024,-U,-3 my.server.com:/tftpboot /mnt</b>
or
# <b>mount my.server.com:/tftpboot /mnt</b>
</pre>
If you don't supply the read and write sizes, there is a risk
that the default values are too large. The emulator currently
does not handle fragmentation/defragmentation of <i>outgoing</i>
packets, so going above the ethernet frame size (1518) is a very
bad idea. Incoming packets (reading from NFS) should work, though,
for example during an NFS install.
</ul>

The example above uses read-only mounts. That is enough for things like
letting NetBSD/pmax or OpenBSD/pmax install via NFS, without the need for
a CDROM ISO image. You can use a read-write mount if you wish to share
files in both directions, but then you should be aware of the
fragmentation issue mentioned above.


<p><br>
<a name="devices"></a>
<h3>Emulation of hardware devices</h3>

Each file called <tt>dev_*.c</tt> in the <tt>src/devices/</tt> directory is
responsible for one hardware device. These are used from
<tt>src/machine.c</tt>, when initializing which hardware a particular
machine model will be using, or when adding devices to a machine using the
<tt>device()</tt> command in configuration files.

<p>(I'll be using the name "<tt>foo</tt>" as the name of the device in all
these examples. This is pseudocode; it might need some modification to
actually compile and run.)

<p>Each device should have the following:

<p>
<ul>
<li>A <tt>devinit</tt> function in <tt>src/devices/dev_foo.c</tt>. It
would typically look something like this:
<pre>
/*
 *  devinit_foo():
 */
int devinit_foo(struct devinit *devinit)
{
    struct foo_data *d = malloc(sizeof(struct foo_data));

    if (d == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(1);
    }
    memset(d, 0, sizeof(struct foo_data));

    /*
     *  Set up stuff here, for example fill d with useful
     *  data. devinit contains settings like address, irq_nr,
     *  and other things.
     *
     *  ...
     */

    memory_device_register(devinit->machine->memory, devinit->name,
        devinit->addr, DEV_FOO_LENGTH,
        dev_foo_access, (void *)d, DM_DEFAULT, NULL);

    /*  This should only be here if the device
        has a tick function:  */
    machine_add_tickfunction(devinit->machine, dev_foo_tick, d,
        FOO_TICKSHIFT);

    /*  Return 1 if the device was successfully added.  */
    return 1;
}
</pre><br>

<li>At the top of <tt>dev_foo.c</tt>, the <tt>foo_data</tt> struct
should be defined.
<pre>
struct foo_data {
    int irq_nr;
    /*  ...  */
};
</pre><br>
(There is an exception to this rule: ugly hacks which allow
code in <tt>src/machine.c</tt> to use some structures make it
necessary to place the <tt>struct foo_data</tt> in
<tt>src/include/devices.h</tt> instead of in <tt>dev_foo.c</tt>
itself. This is useful, for example, for interrupt controllers.)
<p>
<li>If <tt>foo</tt> has a tick function (that is, something that needs to be
run at regular intervals) then <tt>FOO_TICKSHIFT</tt> and a tick
function need to be defined as well:
<pre>
#define FOO_TICKSHIFT   14

void dev_foo_tick(struct cpu *cpu, void *extra)
{
    struct foo_data *d = (struct foo_data *) extra;

    if (.....)
        cpu_interrupt(cpu, d->irq_nr);
    else
        cpu_interrupt_ack(cpu, d->irq_nr);
}
</pre><br>

<li>Does this device belong to a standard bus?
<ul>
<li>If this device should be detectable as a PCI device, then
glue code should be added to
<tt>src/devices/bus_pci.c</tt>.
<li>If this is a legacy ISA device which should be usable by
any machine which has an ISA bus, then the device should
be added to <tt>src/devices/bus_isa.c</tt>.
</ul>
<p>
<li>And last but not least, the device should have an access function.
The access function is called whenever there is a load or store
to an address which is in the device's memory-mapped region.
<pre>
int dev_foo_access(struct cpu *cpu, struct memory *mem,
    uint64_t relative_addr, unsigned char *data, size_t len,
    int writeflag, void *extra)
{
    struct foo_data *d = extra;
    uint64_t idata = 0, odata = 0;

    idata = memory_readmax64(cpu, data, len);
    switch (relative_addr) {
    /*  ....  */
    }

    if (writeflag == MEM_READ)
        memory_writemax64(cpu, data, len, odata);

    /*  Perhaps interrupts need to be asserted or
        deasserted:  */
    dev_foo_tick(cpu, extra);

    /*  Return successfully.  */
    return 1;
}
</pre><br>
</ul>

<p>
Until 2004-07-02, the return value of the access function was a
true/false value: 1 for success, or 0 for device access failure. A device
access failure (on MIPS) results in a DBE exception.

<p>
Some devices have since been converted to support arbitrary memory latency
values. For these devices, the return value is the number of cycles that
the read or write access took. A value of 1 means one cycle, a value of 10
means 10 cycles. Negative values are used for device access failures; the
absolute value is then the number of cycles, so a value of -5
means that the access failed and took 5 cycles.

<p>
To be compatible with pre-20040702 devices, a return value of 0 is treated
by the caller (in <tt>src/memory_rw.c</tt>) as a value of -1.


</body>
</html>