4 |
<table border=0 width=100% bgcolor="#d0d0d0"><tr> |
<table border=0 width=100% bgcolor="#d0d0d0"><tr> |
5 |
<td width=100% align=center valign=center><table border=0 width=100%><tr> |
<td width=100% align=center valign=center><table border=0 width=100%><tr> |
6 |
<td align="left" valign=center bgcolor="#d0efff"><font color="#6060e0" size="6"> |
<td align="left" valign=center bgcolor="#d0efff"><font color="#6060e0" size="6"> |
7 |
<b>Gavare's eXperimental Emulator: </b></font> |
<b>Gavare's eXperimental Emulator:</b></font><br> |
8 |
<font color="#000000" size="6"><b>Technical details</b> |
<font color="#000000" size="6"><b>Technical details</b> |
9 |
</font></td></tr></table></td></tr></table><p> |
</font></td></tr></table></td></tr></table><p> |
10 |
|
|
11 |
<!-- |
<!-- |
12 |
|
|
13 |
$Id: technical.html,v 1.67 2005/11/24 12:32:10 debug Exp $ |
$Id: technical.html,v 1.74 2006/06/17 10:16:22 debug Exp $ |
14 |
|
|
15 |
Copyright (C) 2004-2005 Anders Gavare. All rights reserved. |
Copyright (C) 2004-2006 Anders Gavare. All rights reserved. |
16 |
|
|
17 |
Redistribution and use in source and binary forms, with or without |
Redistribution and use in source and binary forms, with or without |
18 |
modification, are permitted provided that the following conditions are met: |
modification, are permitted provided that the following conditions are met: |
70 |
compared just like that. |
compared just like that. |
71 |
|
|
72 |
<p>Performance depends on several factors, including (but not limited to) |
<p>Performance depends on several factors, including (but not limited to) |
73 |
host architecture, host clock speed, which compiler and compiler flags |
host architecture, target architecture, host clock speed, which compiler |
74 |
were used to build the emulator, what the workload is, and so on. For |
and compiler flags were used to build the emulator, what the workload is, |
75 |
example, if an emulated operating system tries to read a block from disk, |
what additional runtime flags are given to the emulator, and so on. |
76 |
from its point of view the read was instantaneous (no waiting). So 1 MIPS |
|
77 |
in an emulated OS might have taken more than one million instructions on a |
<p>Devices are generally not timing-accurate: for example, if an emulated |
78 |
real machine. |
operating system tries to read a block from disk, from its point of view |
79 |
|
the read was instantaneous (no waiting). So 1 MIPS in an emulated OS might |
80 |
|
have taken more than one million instructions on a real machine. |
81 |
|
|
82 |
<p>Also, if the emulator says it has executed 1 million instructions, and |
<p>Also, if the emulator says it has executed 1 million instructions, and |
83 |
the CPU family in question was capable of scalar execution (i.e. one cycle |
the CPU family in question was capable of scalar execution (i.e. one cycle |
87 |
|
|
88 |
<p>Because of these issues, it is in my opinion best to measure |
<p>Because of these issues, it is in my opinion best to measure |
89 |
performance as the actual (real-world) time it takes to perform a task |
performance as the actual (real-world) time it takes to perform a task |
90 |
with the emulator. Typical examples would be "How long does it take to |
with the emulator, e.g.: |
|
install NetBSD?", or "How long does it take to compile XYZ inside NetBSD |
|
|
in the emulator?". |
|
|
|
|
|
<p>So, how fast is it? :-) Answer: it varies. |
|
|
|
|
|
<p>The emulation technique used varies depending on which processor type |
|
|
is being emulated. (One of my main goals with GXemul is to experiment with |
|
|
different kinds of emulation, so these might change in the future.) |
|
91 |
|
|
92 |
<ul> |
<ul> |
93 |
<li><b>MIPS:</b><br> |
<li>"How long does it take to install NetBSD onto a disk image?" |
94 |
There are two emulation modes. The most important one is an |
<li>"How long does it take to compile XYZ inside NetBSD |
95 |
implementation of a <i>dynamic binary translator</i>. |
in the emulator?". |
|
(Compared to real binary translators, though, GXemul's bintrans |
|
|
subsystem is very simple and does not perform very well.) |
|
|
This mode can be used on Alpha and i386 host. The other emulation |
|
|
mode is simple interpretation, where an instruction is read from |
|
|
emulated memory, and interpreted one-at-a-time. (Slow, but it |
|
|
works. It can be forcefully used by using the <tt>-B</tt> command |
|
|
line option.) |
|
|
<p> |
|
|
<li><b>All other modes:</b><br> |
|
|
These use a kind of dynamic translation system. (This system does |
|
|
not use host-specific backends, so it is not "recompilation" or |
|
|
anything like that.) Speed is slower than real binary translation, |
|
|
but faster than traditional interpretation, and with some tricks |
|
|
it will hopefully still give reasonable speed. The ARM and PowerPC |
|
|
emulation modes uses this kind of translation. |
|
96 |
</ul> |
</ul> |
97 |
|
|
98 |
|
<p>So, how fast is it? :-) Answer: it varies. |
99 |
|
|
100 |
|
|
101 |
|
|
102 |
|
|
103 |
|
|
312 |
|
|
313 |
Each file called <tt>dev_*.c</tt> in the <tt>src/device/</tt> directory is |
Each file called <tt>dev_*.c</tt> in the <tt>src/device/</tt> directory is |
314 |
responsible for one hardware device. These are used from |
responsible for one hardware device. These are used from |
315 |
<tt>src/machine.c</tt>, when initializing which hardware a particular |
<tt>src/machines/machine_*.c</tt>, when initializing which hardware a particular |
316 |
machine model will be using, or when adding devices to a machine using the |
machine model will be using, or when adding devices to a machine using the |
317 |
<tt>device()</tt> command in configuration files. |
<tt>device()</tt> command in configuration files. |
318 |
|
|
327 |
<li>A <tt>devinit</tt> function in <tt>src/devices/dev_foo.c</tt>. It |
<li>A <tt>devinit</tt> function in <tt>src/devices/dev_foo.c</tt>. It |
328 |
would typically look something like this: |
would typically look something like this: |
329 |
<pre> |
<pre> |
330 |
/* |
DEVINIT(foo) |
|
* devinit_foo(): |
|
|
*/ |
|
|
int devinit_foo(struct devinit *devinit) |
|
331 |
{ |
{ |
332 |
struct foo_data *d = malloc(sizeof(struct foo_data)); |
struct foo_data *d = malloc(sizeof(struct foo_data)); |
333 |
|
|
335 |
fprintf(stderr, "out of memory\n"); |
fprintf(stderr, "out of memory\n"); |
336 |
exit(1); |
exit(1); |
337 |
} |
} |
338 |
memset(d, 0, sizeof(struct foon_data)); |
memset(d, 0, sizeof(struct foo_data)); |
339 |
|
|
340 |
/* |
/* |
341 |
* Set up stuff here, for example fill d with useful |
* Set up stuff here, for example fill d with useful |
359 |
} |
} |
360 |
</pre><br> |
</pre><br> |
361 |
|
|
362 |
|
<p><tt>DEVINIT(foo)</tt> is defined as <tt>int devinit_foo(struct devinit *devinit)</tt>, |
363 |
|
and the <tt>devinit</tt> argument contains everything that the device driver's |
364 |
|
initialization function needs. |
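<p>As a minimal sketch of how such a macro can work, assuming it is built with the C preprocessor's token-pasting operator: the <tt>struct devinit</tt> below is a hypothetical stand-in with only two illustrative fields, and the <tt>dummy</tt> field, the <tt>return_ptr</tt> field and the return value 1 are likewise assumptions made for this example, not the emulator's actual definitions.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the emulator's struct devinit (assumption). */
struct devinit {
	const char *name;	/* illustrative: name of the device */
	void *return_ptr;	/* illustrative: lets the caller reach the device data */
};

/*
 * Assumed definition: paste the device name onto the "devinit_" prefix,
 * so that DEVINIT(foo) becomes
 *	int devinit_foo(struct devinit *devinit)
 */
#define DEVINIT(name) int devinit_##name(struct devinit *devinit)

/* Hypothetical per-device state. */
struct foo_data {
	int dummy;
};

DEVINIT(foo)
{
	struct foo_data *d = malloc(sizeof(struct foo_data));

	if (d == NULL) {
		fprintf(stderr, "out of memory\n");
		exit(1);
	}
	memset(d, 0, sizeof(struct foo_data));

	d->dummy = 42;			/* placeholder for real setup work */
	devinit->return_ptr = d;	/* hand the state back to the caller */

	return 1;			/* assumed to mean success in this sketch */
}
```

Because the macro expands to an ordinary function header, the result can be called like any other function, for example <tt>devinit_foo(&amp;di)</tt> with a caller-provided <tt>struct devinit di</tt>.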
365 |
|
|
366 |
|
<p> |
367 |
<li>At the top of <tt>dev_foo.c</tt>, the <tt>foo_data</tt> struct |
<li>At the top of <tt>dev_foo.c</tt>, the <tt>foo_data</tt> struct |
368 |
should be defined. |
should be defined. |
369 |
<pre> |
<pre> |
407 |
<p> |
<p> |
408 |
<li>And last but not least, the device should have an access function. |
<li>And last but not least, the device should have an access function. |
409 |
The access function is called whenever there is a load or store |
The access function is called whenever there is a load or store |
410 |
to an address which is in the device's memory-mapped region. |
to an address which is in the device's memory-mapped region. To |
411 |
<pre> |
simplify things a little, a macro <tt>DEVICE_ACCESS(x)</tt> |
412 |
int dev_foo_access(struct cpu *cpu, struct memory *mem, |
is expanded into<pre> |
413 |
|
int dev_x_access(struct cpu *cpu, struct memory *mem, |
414 |
uint64_t relative_addr, unsigned char *data, size_t len, |
uint64_t relative_addr, unsigned char *data, size_t len, |
415 |
int writeflag, void *extra) |
int writeflag, void *extra) |
416 |
|
</pre> The access function can look like this: |
417 |
|
<pre> |
418 |
|
DEVICE_ACCESS(foo) |
419 |
{ |
{ |
420 |
struct foo_data *d = extra; |
struct foo_data *d = extra; |
421 |
uint64_t idata = 0, odata = 0; |
uint64_t idata = 0, odata = 0; |