4 |
<table border=0 width=100% bgcolor="#d0d0d0"><tr> |
<table border=0 width=100% bgcolor="#d0d0d0"><tr> |
5 |
<td width=100% align=center valign=center><table border=0 width=100%><tr> |
<td width=100% align=center valign=center><table border=0 width=100%><tr> |
6 |
<td align="left" valign=center bgcolor="#d0efff"><font color="#6060e0" size="6"> |
<td align="left" valign=center bgcolor="#d0efff"><font color="#6060e0" size="6"> |
7 |
<b>Gavare's eXperimental Emulator: </b></font> |
<b>Gavare's eXperimental Emulator:</b></font><br> |
8 |
<font color="#000000" size="6"><b>Technical details</b> |
<font color="#000000" size="6"><b>Technical details</b> |
9 |
</font></td></tr></table></td></tr></table><p> |
</font></td></tr></table></td></tr></table><p> |
10 |
|
|
11 |
<!-- |
<!-- |
12 |
|
|
13 |
$Id: technical.html,v 1.62 2005/08/16 05:15:24 debug Exp $ |
$Id: technical.html,v 1.74 2006/06/17 10:16:22 debug Exp $ |
14 |
|
|
15 |
Copyright (C) 2004-2005 Anders Gavare. All rights reserved. |
Copyright (C) 2004-2006 Anders Gavare. All rights reserved. |
16 |
|
|
17 |
Redistribution and use in source and binary forms, with or without |
Redistribution and use in source and binary forms, with or without |
18 |
modification, are permitted provided that the following conditions are met: |
modification, are permitted provided that the following conditions are met: |
64 |
<a name="speed"></a> |
<a name="speed"></a> |
65 |
<h3>Speed and emulation modes</h3> |
<h3>Speed and emulation modes</h3> |
66 |
|
|
67 |
So, how fast is GXemul? There is no good answer to this. There is |
So, how fast is GXemul? There is no short answer to this. There is |
68 |
especially no answer to the question <b>What is the slowdown factor?</b>, |
especially no answer to the question <b>What is the slowdown factor?</b>, |
69 |
because the host architecture and emulated architecture can usually not be |
because the host architecture and emulated architecture can usually not be |
70 |
compared just like that. |
compared just like that. |
71 |
|
|
72 |
<p>Performance depends on several factors, including (but not limited to) |
<p>Performance depends on several factors, including (but not limited to) |
73 |
host architecture, host clock speed, which compiler and compiler flags |
host architecture, target architecture, host clock speed, which compiler |
74 |
were used to build the emulator, what the workload is, and so on. For |
and compiler flags were used to build the emulator, what the workload is, |
75 |
example, if an emulated operating system tries to read a block from disk, |
what additional runtime flags are given to the emulator, and so on. |
76 |
from its point of view the read was instantaneous (no waiting). So 1 MIPS |
|
77 |
in an emulated OS might have taken more than one million instructions on a |
<p>Devices are generally not timing-accurate: for example, if an emulated |
78 |
real machine. |
operating system tries to read a block from disk, from its point of view |
79 |
|
the read was instantaneous (no waiting). So 1 MIPS in an emulated OS might |
80 |
|
have taken more than one million instructions on a real machine. |
81 |
|
|
82 |
<p>Also, if the emulator says it has executed 1 million instructions, and |
<p>Also, if the emulator says it has executed 1 million instructions, and |
83 |
the CPU family in question was capable of scalar execution (i.e. one cycle |
the CPU family in question was capable of scalar execution (i.e. one cycle |
87 |
|
|
88 |
<p>Because of these issues, it is in my opinion best to measure |
<p>Because of these issues, it is in my opinion best to measure |
89 |
performance as the actual (real-world) time it takes to perform a task |
performance as the actual (real-world) time it takes to perform a task |
90 |
with the emulator. Typical examples would be "How long does it take to |
with the emulator, e.g.: |
|
install NetBSD?", or "How long does it take to compile XYZ inside NetBSD |
|
|
in the emulator?". |
|
|
|
|
|
<p>The emulation technique used varies depending on which processor type |
|
|
is being emulated. (One of my main goals with GXemul is to experiment with |
|
|
different kinds of emulation, so these might change in the future.) |
|
91 |
|
|
92 |
<ul> |
<ul> |
93 |
<li><b>MIPS:</b><br> |
<li>"How long does it take to install NetBSD onto a disk image?" |
94 |
There are two emulation modes. The most important one is an |
<li>"How long does it take to compile XYZ inside NetBSD |
95 |
implementation of a <i>dynamic binary translator</i>. |
in the emulator?". |
|
(Compared to real binary translators, though, GXemul's bintrans |
|
|
subsystem is very simple and does not perform very well.) |
|
|
This mode can be used on Alpha and i386 host. The other emulation |
|
|
mode is simple interpretation, where an instruction is read from |
|
|
emulated memory, and interpreted one-at-a-time. (Slow, but it |
|
|
works. It can be forcefully used by using the <tt>-B</tt> command |
|
|
line option.) |
|
|
<p> |
|
|
<li><b>All other modes:</b><br> |
|
|
These are under development, using a new dynamic translation |
|
|
system. This system does not use host-specific backends. |
|
|
Speed is slower than real binary translation, but faster than |
|
|
traditional interpretation, and with some tricks it will hopefully |
|
|
still give reasonable speed. These modes don't really work yet, |
|
|
and are not enabled by default in the stable release. |
|
96 |
</ul> |
</ul> |
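<p>As a rough illustration of measuring real-world time for a task, here
is a minimal sketch in C. It is not part of GXemul; the helper names
<tt>elapsed_seconds</tt> and <tt>time_task</tt> are made up for this
example, and it assumes a POSIX host that provides
<tt>clock_gettime()</tt> with <tt>CLOCK_MONOTONIC</tt>.

```c
#define _POSIX_C_SOURCE 199309L
#include <time.h>

/* Wall-clock seconds between two readings of a monotonic clock. */
double elapsed_seconds(struct timespec start, struct timespec end)
{
	return (double)(end.tv_sec - start.tv_sec)
	    + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/*
 * Hypothetical helper: run a task once and return how long it took,
 * in seconds of real time (not emulated instructions or cycles).
 */
double time_task(void (*task)(void))
{
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	task();
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return elapsed_seconds(t0, t1);
}
```

In practice the "task" would be something like a complete guest OS
install, so timing the emulator process from the outside may be just as
useful as instrumenting code.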
97 |
|
|
98 |
|
<p>So, how fast is it? :-) Answer: it varies. |
99 |
|
|
100 |
|
|
101 |
|
|
102 |
|
|
103 |
|
|
300 |
files in both directions, but then you should be aware of the |
files in both directions, but then you should be aware of the |
301 |
fragmentation issue mentioned above. |
fragmentation issue mentioned above. |
302 |
|
|
|
<p>TODO: Write a section on how to connect multiple emulator instances. |
|
|
(Using the <tt>local_port</tt> and <tt>add_remote</tt> configuration file |
|
|
commands.) |
|
303 |
|
|
304 |
|
|
305 |
|
|
310 |
<a name="devices"></a> |
<a name="devices"></a> |
311 |
<h3>Emulation of hardware devices</h3> |
<h3>Emulation of hardware devices</h3> |
312 |
|
|
313 |
Each file in the <tt>src/devices/</tt> directory is responsible for one |
Each file called <tt>dev_*.c</tt> in the <tt>src/devices/</tt> directory is |
314 |
hardware device. These are used from <tt>src/machine.c</tt>, when |
responsible for one hardware device. These are used from |
315 |
initializing which hardware a particular machine model will be using, or |
<tt>src/machines/machine_*.c</tt>, when initializing which hardware a particular |
316 |
when adding devices to a machine using the <tt>device()</tt> command in |
machine model will be using, or when adding devices to a machine using the |
317 |
configuration files. |
<tt>device()</tt> command in configuration files. |
|
|
|
|
<p><font color="#ff0000">NOTE: The device registry subsystem is currently |
|
|
in a state of flux, as it is being redesigned.</font> |
|
318 |
|
|
319 |
<p>(I'll be using the name "<tt>foo</tt>" as the name of the device in all |
<p>(I'll be using the name "<tt>foo</tt>" as the name of the device in all |
320 |
these examples. This is pseudocode; it might need some modification to |
these examples. This is pseudocode; it might need some modification to |
327 |
<li>A <tt>devinit</tt> function in <tt>src/devices/dev_foo.c</tt>. It |
<li>A <tt>devinit</tt> function in <tt>src/devices/dev_foo.c</tt>. It |
328 |
would typically look something like this: |
would typically look something like this: |
329 |
<pre> |
<pre> |
330 |
/* |
DEVINIT(foo) |
|
* devinit_foo(): |
|
|
*/ |
|
|
int devinit_foo(struct devinit *devinit) |
|
331 |
{ |
{ |
332 |
struct foo_data *d = malloc(sizeof(struct foo_data)); |
struct foo_data *d = malloc(sizeof(struct foo_data)); |
333 |
|
|
335 |
fprintf(stderr, "out of memory\n"); |
fprintf(stderr, "out of memory\n"); |
336 |
exit(1); |
exit(1); |
337 |
} |
} |
338 |
memset(d, 0, sizeof(struct foon_data)); |
memset(d, 0, sizeof(struct foo_data)); |
339 |
|
|
340 |
/* |
/* |
341 |
* Set up stuff here, for example fill d with useful |
* Set up stuff here, for example fill d with useful |
347 |
|
|
348 |
memory_device_register(devinit->machine->memory, devinit->name, |
memory_device_register(devinit->machine->memory, devinit->name, |
349 |
devinit->addr, DEV_FOO_LENGTH, |
devinit->addr, DEV_FOO_LENGTH, |
350 |
dev_foo_access, (void *)d, MEM_DEFAULT, NULL); |
dev_foo_access, (void *)d, DM_DEFAULT, NULL); |
351 |
|
|
352 |
/* This should only be here if the device |
/* This should only be here if the device |
353 |
has a tick function: */ |
has a tick function: */ |
359 |
} |
} |
360 |
</pre><br> |
</pre><br> |
361 |
|
|
362 |
|
<p><tt>DEVINIT(foo)</tt> is defined as <tt>int devinit_foo(struct devinit *devinit)</tt>, |
363 |
|
and the <tt>devinit</tt> argument contains everything that the device driver's |
364 |
|
initialization function needs. |
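<p>A macro with the expansion described above could be written with
token pasting, roughly as in the following sketch. The
<tt>struct devinit</tt> shown here is a stand-in with only two fields
(their types are assumptions), <tt>foo_instance</tt> and the
<tt>base</tt> field are hypothetical, and the return convention (1 on
success, 0 on failure) is assumed for illustration rather than taken
from the emulator's source.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the emulator's struct devinit (assumed fields/types). */
struct devinit {
	const char	*name;
	uint64_t	addr;
};

/* DEVINIT(foo) expands to: int devinit_foo(struct devinit *devinit) */
#define DEVINIT(x)	int devinit_ ## x(struct devinit *devinit)

struct foo_data {
	uint64_t	base;		/* hypothetical field */
};

/* Hypothetical: remembers the last instance so it can be inspected. */
static struct foo_data *foo_instance;

DEVINIT(foo)
{
	struct foo_data *d = malloc(sizeof(struct foo_data));

	if (d == NULL)
		return 0;	/* assumed failure value */
	memset(d, 0, sizeof(struct foo_data));

	d->base = devinit->addr;
	foo_instance = d;
	return 1;		/* assumed success value */
}
```

The point of the macro is only to keep every driver's init prototype
identical; the function body is written exactly as if the prototype had
been spelled out by hand.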
365 |
|
|
366 |
|
<p> |
367 |
<li>At the top of <tt>dev_foo.c</tt>, the <tt>foo_data</tt> struct |
<li>At the top of <tt>dev_foo.c</tt>, the <tt>foo_data</tt> struct |
368 |
should be defined. |
should be defined. |
369 |
<pre> |
<pre> |
372 |
/* ... */ |
/* ... */ |
373 |
} |
} |
374 |
</pre><br> |
</pre><br> |
375 |
|
(There is an exception to this rule; ugly hacks which allow |
376 |
|
code in <tt>src/machine.c</tt> to use some structures make it |
377 |
|
necessary to place the <tt>struct foo_data</tt> in |
378 |
|
<tt>src/include/devices.h</tt> instead of in <tt>dev_foo.c</tt> |
379 |
|
itself. This is useful for example for interrupt controllers.) |
380 |
|
<p> |
381 |
<li>If <tt>foo</tt> has a tick function (that is, something that needs to be |
<li>If <tt>foo</tt> has a tick function (that is, something that needs to be |
382 |
run at regular intervals) then <tt>FOO_TICKSHIFT</tt> and a tick |
run at regular intervals) then <tt>FOO_TICKSHIFT</tt> and a tick |
383 |
function need to be defined as well: |
function need to be defined as well: |
384 |
<pre> |
<pre> |
385 |
#define FOO_TICKSHIFT 10 |
#define FOO_TICKSHIFT 14 |
386 |
|
|
387 |
void dev_foo_tick(struct cpu *cpu, void *extra) |
void dev_foo_tick(struct cpu *cpu, void *extra) |
388 |
{ |
{ |
395 |
} |
} |
396 |
</pre><br> |
</pre><br> |
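<p>To make the role of the tick shift more concrete, here is a small
self-contained sketch. It <i>assumes</i> that the shift is interpreted as
a power-of-two interval, i.e. that the tick function runs once every
<tt>2^FOO_TICKSHIFT</tt> emulated cycles; the text above does not state
this. The <tt>run_cycles</tt> loop and the <tt>tick_count</tt> field are
hypothetical and only stand in for whatever the emulator's main loop
really does.

```c
#include <stddef.h>
#include <stdint.h>

#define FOO_TICKSHIFT	14

struct cpu;			/* opaque here; not needed by the sketch */

struct foo_data {
	int	tick_count;	/* hypothetical: counts tick calls */
};

void dev_foo_tick(struct cpu *cpu, void *extra)
{
	struct foo_data *d = extra;

	(void)cpu;		/* unused in this sketch */
	d->tick_count++;
}

/*
 * Hypothetical main loop: call the tick function once every
 * (1 << tickshift) cycles.
 */
void run_cycles(uint64_t ncycles, int tickshift,
	void (*tick)(struct cpu *, void *), void *extra)
{
	uint64_t period = (uint64_t)1 << tickshift;
	uint64_t cycle;

	for (cycle = 1; cycle <= ncycles; cycle++)
		if ((cycle & (period - 1)) == 0)
			tick(NULL, extra);
}
```

Under this assumption, a larger shift means the tick function is called
less often, which costs less host time but gives the device coarser
timing.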
397 |
|
|
398 |
|
<li>Does this device belong to a standard bus? |
399 |
|
<ul> |
400 |
|
<li>If this device should be detectable as a PCI device, then |
401 |
|
glue code should be added to |
402 |
|
<tt>src/devices/bus_pci.c</tt>. |
403 |
|
<li>If this is a legacy ISA device which should be usable by |
404 |
|
any machine which has an ISA bus, then the device should |
405 |
|
be added to <tt>src/devices/bus_isa.c</tt>. |
406 |
|
</ul> |
407 |
|
<p> |
408 |
<li>And last but not least, the device should have an access function. |
<li>And last but not least, the device should have an access function. |
409 |
The access function is called whenever there is a load or store |
The access function is called whenever there is a load or store |
410 |
to an address which is in the device's memory-mapped region. |
to an address which is in the device's memory-mapped region. To |
411 |
<pre> |
simplify things a little, a macro <tt>DEVICE_ACCESS(x)</tt> |
412 |
int dev_foo_access(struct cpu *cpu, struct memory *mem, |
is expanded into<pre> |
413 |
|
int dev_x_access(struct cpu *cpu, struct memory *mem, |
414 |
uint64_t relative_addr, unsigned char *data, size_t len, |
uint64_t relative_addr, unsigned char *data, size_t len, |
415 |
int writeflag, void *extra) |
int writeflag, void *extra) |
416 |
|
</pre> The access function can look like this: |
417 |
|
<pre> |
418 |
|
DEVICE_ACCESS(foo) |
419 |
{ |
{ |
420 |
struct foo_data *d = extra; |
struct foo_data *d = extra; |
421 |
uint64_t idata = 0, odata = 0; |
uint64_t idata = 0, odata = 0; |