10 |
|
|
11 |
<!-- |
<!-- |
12 |
|
|
13 |
$Id: technical.html,v 1.72 2006/02/18 15:18:15 debug Exp $ |
$Id: technical.html,v 1.74 2006/06/17 10:16:22 debug Exp $ |
14 |
|
|
15 |
Copyright (C) 2004-2006 Anders Gavare. All rights reserved. |
Copyright (C) 2004-2006 Anders Gavare. All rights reserved. |
16 |
|
|
70 |
compared just like that. |
compared just like that. |
71 |
|
|
72 |
<p>Performance depends on several factors, including (but not limited to) |
<p>Performance depends on several factors, including (but not limited to) |
73 |
host architecture, host clock speed, which compiler and compiler flags |
host architecture, target architecture, host clock speed, which compiler |
74 |
were used to build the emulator, what the workload is, and so on. For |
and compiler flags were used to build the emulator, what the workload is, |
75 |
example, if an emulated operating system tries to read a block from disk, |
what additional runtime flags are given to the emulator, and so on. |
76 |
from its point of view the read was instantaneous (no waiting). So 1 MIPS |
|
77 |
in an emulated OS might have taken more than one million instructions on a |
<p>Devices are generally not timing-accurate: for example, if an emulated |
78 |
real machine. |
operating system tries to read a block from disk, from its point of view |
79 |
|
the read was instantaneous (no waiting). So 1 MIPS in an emulated OS might |
80 |
|
have taken more than one million instructions on a real machine. |
81 |
|
|
82 |
<p>Also, if the emulator says it has executed 1 million instructions, and |
<p>Also, if the emulator says it has executed 1 million instructions, and |
83 |
the CPU family in question was capable of scalar execution (i.e. one cycle |
the CPU family in question was capable of scalar execution (i.e. one cycle |
87 |
|
|
88 |
<p>Because of these issues, it is in my opinion best to measure |
<p>Because of these issues, it is in my opinion best to measure |
89 |
performance as the actual (real-world) time it takes to perform a task |
performance as the actual (real-world) time it takes to perform a task |
90 |
with the emulator. Typical examples would be "How long does it take to |
with the emulator, e.g.: |
|
install NetBSD?", or "How long does it take to compile XYZ inside NetBSD |
|
|
in the emulator?". |
|
|
|
|
|
<p>So, how fast is it? :-) Answer: it varies. |
|
|
|
|
|
<p>The emulation technique used varies depending on which processor type |
|
|
is being emulated. (One of my main goals with GXemul is to experiment with |
|
|
different kinds of emulation, so these might change in the future.) |
|
91 |
|
|
92 |
<ul> |
<ul> |
93 |
<li><b>MIPS:</b><br> |
<li>"How long does it take to install NetBSD onto a disk image?" |
94 |
There are two emulation modes. The most important one is an |
<li>"How long does it take to compile XYZ inside NetBSD |
95 |
implementation of a <i>dynamic binary translator</i>. |
in the emulator?". |
|
(Compared to real binary translators, though, GXemul's bintrans |
|
|
subsystem is very simple and does not perform very well.) |
|
|
This mode can be used on Alpha and i386 hosts. The other emulation |
|
|
mode is simple interpretation, where an instruction is read from |
|
|
emulated memory, and interpreted one-at-a-time. (Slow, but it |
|
|
works. It can be forced by using the <tt>-B</tt> command |
|
|
line option.) |
|
|
<p> |
|
|
<li><b>All other modes:</b><br> |
|
|
These use a kind of dynamic translation system. This system does |
|
|
not recompile anything into native code; it only uses tables of |
|
|
pointers to functions written in (sometimes machine-generated) C |
|
|
code. Speed is lower than what can be achieved using real binary |
|
|
translation into native code, but higher than when traditional |
|
|
interpretation is used. With some tricks, it will hopefully still |
|
|
give reasonable speed. The ARM and PowerPC |
|
|
emulation modes use this kind of translation. |
|
96 |
</ul> |
</ul> |
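<p>The "tables of pointers to functions" approach described above can be
illustrated with a minimal sketch. This is not GXemul's actual code: the toy
instruction set, the <tt>toy_</tt> names and the structure layouts are all
hypothetical, invented for illustration. Each raw instruction is decoded once
into a handler pointer plus pre-decoded operands, and execution is then just
a series of indirect calls with no further decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy CPU state; not GXemul's real structures. */
struct toy_cpu {
    uint32_t reg[4];
    size_t pc;      /* index into the array of translated entries */
    int running;
};

struct toy_ic;
typedef void (*toy_handler)(struct toy_cpu *, const struct toy_ic *);

/* One translated entry: a handler pointer plus pre-decoded operands. */
struct toy_ic {
    toy_handler f;
    uint32_t a, b, c;
};

/* Handlers are plain C functions; nothing is compiled to native code. */
static void op_li(struct toy_cpu *cpu, const struct toy_ic *ic)
{
    cpu->reg[ic->a] = ic->b;                    /* load immediate */
}

static void op_add(struct toy_cpu *cpu, const struct toy_ic *ic)
{
    cpu->reg[ic->a] = cpu->reg[ic->b] + cpu->reg[ic->c];
}

static void op_halt(struct toy_cpu *cpu, const struct toy_ic *ic)
{
    (void)ic;
    cpu->running = 0;
}

/* Table of pointers to handler functions, indexed by opcode. */
enum { OP_LI, OP_ADD, OP_HALT, OP_COUNT };
static const toy_handler handler_table[OP_COUNT] = { op_li, op_add, op_halt };

/* Translation step: decode each raw instruction exactly once.
 * (Register numbers are not range-checked in this sketch.) */
static void toy_translate(const uint8_t raw[][4], size_t n, struct toy_ic *out)
{
    for (size_t i = 0; i < n; i++) {
        assert(raw[i][0] < OP_COUNT);
        out[i].f = handler_table[raw[i][0]];
        out[i].a = raw[i][1];
        out[i].b = raw[i][2];
        out[i].c = raw[i][3];
    }
}

/* Execution step: no decoding here, only indirect calls. */
static void toy_run(struct toy_cpu *cpu, const struct toy_ic *ics)
{
    cpu->pc = 0;
    cpu->running = 1;
    while (cpu->running) {
        const struct toy_ic *ic = &ics[cpu->pc++];
        ic->f(cpu, ic);
    }
}

/* Translates and runs a four-instruction program: r2 = 2 + 40. */
uint32_t toy_demo(void)
{
    static const uint8_t program[][4] = {
        { OP_LI,   0,  2, 0 },
        { OP_LI,   1, 40, 0 },
        { OP_ADD,  2,  0, 1 },
        { OP_HALT, 0,  0, 0 },
    };
    struct toy_ic ics[4];
    struct toy_cpu cpu = { {0}, 0, 0 };

    toy_translate(program, 4, ics);
    toy_run(&cpu, ics);
    return cpu.reg[2];
}
```

Calling <tt>toy_demo()</tt> returns the value left in register 2. The saving
over plain interpretation comes from doing the decode work once per
instruction rather than every time the instruction is executed; a real
implementation would additionally need to handle branches and the
invalidation of translated entries when emulated memory changes.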
97 |
|
|
98 |
|
<p>So, how fast is it? :-) Answer: it varies. |
99 |
|
|
100 |
|
|
101 |
|
|
102 |
|
|
103 |
|
|