10 |
|
|
11 |
<!-- |
<!-- |
12 |
|
|
13 |
$Id: technical.html,v 1.72 2006/02/18 15:18:15 debug Exp $ |
$Id: technical.html,v 1.76 2007/06/15 18:07:08 debug Exp $ |
14 |
|
|
15 |
Copyright (C) 2004-2006 Anders Gavare. All rights reserved. |
Copyright (C) 2004-2007 Anders Gavare. All rights reserved. |
16 |
|
|
17 |
Redistribution and use in source and binary forms, with or without |
Redistribution and use in source and binary forms, with or without |
18 |
modification, are permitted provided that the following conditions are met: |
modification, are permitted provided that the following conditions are met: |
70 |
compared just like that. |
compared just like that. |
71 |
|
|
72 |
<p>Performance depends on several factors, including (but not limited to) |
<p>Performance depends on several factors, including (but not limited to) |
73 |
host architecture, host clock speed, which compiler and compiler flags |
host architecture, target architecture, host clock speed, which compiler |
74 |
were used to build the emulator, what the workload is, and so on. For |
and compiler flags were used to build the emulator, what the workload is, |
75 |
example, if an emulated operating system tries to read a block from disk, |
what additional runtime flags are given to the emulator, and so on. |
76 |
from its point of view the read was instantaneous (no waiting). So 1 MIPS |
|
77 |
in an emulated OS might have taken more than one million instructions on a |
<p>Devices are generally not timing-accurate: for example, if an emulated |
78 |
real machine. |
operating system tries to read a block from disk, from its point of view |
79 |
|
the read was instantaneous (no waiting). So 1 MIPS in an emulated OS might |
80 |
|
have taken more than one million instructions on a real machine. |
81 |
|
|
82 |
<p>Also, if the emulator says it has executed 1 million instructions, and |
<p>Also, if the emulator says it has executed 1 million instructions, and |
83 |
the CPU family in question was capable of scalar execution (i.e. one cycle |
the CPU family in question was capable of scalar execution (i.e. one cycle |
87 |
|
|
88 |
<p>Because of these issues, it is, in my opinion, best to measure |
<p>Because of these issues, it is, in my opinion, best to measure |
89 |
performance as the actual (real-world) time it takes to perform a task |
performance as the actual (real-world) time it takes to perform a task |
90 |
with the emulator. Typical examples would be "How long does it take to |
with the emulator, e.g.: |
|
install NetBSD?", or "How long does it take to compile XYZ inside NetBSD |
|
|
in the emulator?". |
|
|
|
|
|
<p>So, how fast is it? :-) Answer: it varies. |
|
|
|
|
|
<p>The emulation technique used varies depending on which processor type |
|
|
is being emulated. (One of my main goals with GXemul is to experiment with |
|
|
different kinds of emulation, so these might change in the future.) |
|
91 |
|
|
92 |
<ul> |
<ul> |
93 |
<li><b>MIPS:</b><br> |
<li>"How long does it take to install NetBSD onto a disk image?" |
94 |
There are two emulation modes. The most important one is an |
<li>"How long does it take to compile XYZ inside NetBSD |
95 |
implementation of a <i>dynamic binary translator</i>. |
in the emulator?". |
|
(Compared to real binary translators, though, GXemul's bintrans |
|
|
subsystem is very simple and does not perform very well.) |
|
|
This mode can be used on Alpha and i386 hosts. The other emulation |
|
|
mode is simple interpretation, where an instruction is read from |
|
|
emulated memory, and interpreted one-at-a-time. (Slow, but it |
|
|
works. It can be forced by using the <tt>-B</tt> command |
|
|
line option.) |
|
|
<p> |
|
|
<li><b>All other modes:</b><br> |
|
|
These use a kind of dynamic translation system. This system does |
|
|
not recompile anything into native code; it only uses tables of |
|
|
pointers to functions written in (sometimes machine-generated) C |
|
|
code. Speed is lower than what can be achieved using real binary |
|
|
translation into native code, but higher than when traditional |
|
|
interpretation is used. With some tricks, it will hopefully still |
|
|
give reasonable speed. The ARM and PowerPC |
|
|
emulation modes use this kind of translation. |
|
96 |
</ul> |
</ul> |
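<p>The "tables of pointers to functions written in C" technique mentioned
above for the non-MIPS modes can be illustrated with a minimal sketch. This
is not GXemul's actual code: the toy CPU, the instruction set, and every
name here (<tt>toy_cpu</tt>, <tt>toy_instr</tt>, <tt>toy_run</tt>, and so
on) are assumptions made up for illustration. The idea shown is only that
each instruction is decoded once into a handler pointer plus operands, and
that execution then becomes a loop of indirect calls with no native code
generation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy CPU: four registers and a program counter. */
struct toy_cpu {
	int32_t reg[4];
	size_t pc;
	int running;
};

/* One pre-decoded instruction: a handler pointer plus its operands. */
struct toy_instr {
	void (*handler)(struct toy_cpu *, const struct toy_instr *);
	int rd, rs;
	int32_t imm;
};

/* reg[rd] = reg[rs] + imm */
static void instr_addi(struct toy_cpu *cpu, const struct toy_instr *ic)
{
	cpu->reg[ic->rd] = cpu->reg[ic->rs] + ic->imm;
}

/* reg[rd] += reg[rs] */
static void instr_add(struct toy_cpu *cpu, const struct toy_instr *ic)
{
	cpu->reg[ic->rd] += cpu->reg[ic->rs];
}

/* Stop the run loop. */
static void instr_halt(struct toy_cpu *cpu, const struct toy_instr *ic)
{
	(void) ic;
	cpu->running = 0;
}

/*
 * Run a pre-decoded program: no decoding happens in this loop,
 * each step is one indirect call through the table entry.
 */
int32_t toy_run(struct toy_cpu *cpu, const struct toy_instr *prog, size_t n)
{
	cpu->pc = 0;
	cpu->running = 1;
	while (cpu->running) {
		assert(cpu->pc < n);
		const struct toy_instr *ic = &prog[cpu->pc++];
		ic->handler(cpu, ic);
	}
	return cpu->reg[0];
}

/* Example program: r0 = 5; r1 = 7; r0 += r1; halt. */
int32_t toy_example(void)
{
	struct toy_cpu cpu = { {0, 0, 0, 0}, 0, 0 };
	const struct toy_instr prog[] = {
		{ instr_addi, 0, 2, 5 },	/* r0 = r2 + 5 (r2 is 0) */
		{ instr_addi, 1, 2, 7 },	/* r1 = r2 + 7 */
		{ instr_add,  0, 1, 0 },	/* r0 += r1 */
		{ instr_halt, 0, 0, 0 },
	};
	return toy_run(&cpu, prog, sizeof(prog) / sizeof(prog[0]));
}
```

<p>Calling <tt>toy_example()</tt> runs the four-entry table and returns the
final value of register 0. A real emulator would fill such a table lazily
from emulated memory; that part is left out here.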
97 |
|
|
98 |
|
<p>So, how fast is it? :-) Answer: it varies. |
99 |
|
|
100 |
|
|
101 |
|
|
102 |
|
|
103 |
|
|
107 |
<a name="net"></a> |
<a name="net"></a> |
108 |
<h3>Networking</h3> |
<h3>Networking</h3> |
109 |
|
|
110 |
<font color="#ff0000">NOTE/TODO: This section is very old and a bit |
<font color="#ff0000">NOTE/TODO: This section is very old.</font> |
|
out of date.</font> |
|
111 |
|
|
112 |
<p>Running an entire operating system under emulation is very interesting |
<p>Running an entire operating system under emulation is very interesting |
113 |
in itself, but for several reasons, running a modern OS without access to |
in itself, but for several reasons, running a modern OS without access to |
309 |
<a name="devices"></a> |
<a name="devices"></a> |
310 |
<h3>Emulation of hardware devices</h3> |
<h3>Emulation of hardware devices</h3> |
311 |
|
|
312 |
Each file called <tt>dev_*.c</tt> in the <tt>src/device/</tt> directory is |
Each file called <tt>dev_*.c</tt> in the |
313 |
|
<a href="../src/devices/"><tt>src/devices/</tt></a> directory is |
314 |
responsible for one hardware device. These are used from |
responsible for one hardware device. These are used from |
315 |
<tt>src/machines/machine_*.c</tt>, when initializing which hardware a particular |
<a href="../src/machines/"><tt>src/machines</tt></a><tt>/machine_*.c</tt>, |
316 |
machine model will be using, or when adding devices to a machine using the |
when initializing which hardware a particular machine model will be using, |
317 |
<tt>device()</tt> command in configuration files. |
or when adding devices to a machine using the <tt>device()</tt> command in |
318 |
|
<a href="configfiles.html">configuration files</a>. |
319 |
|
|
320 |
<p>(I'll be using the name "<tt>foo</tt>" as the name of the device in all |
<p>(I'll be using the name "<tt>foo</tt>" as the name of the device in all |
321 |
these examples. This is pseudocode; it might need some modification to |
these examples. This is pseudocode; it might need some modification to |
330 |
<pre> |
<pre> |
331 |
DEVINIT(foo) |
DEVINIT(foo) |
332 |
{ |
{ |
333 |
struct foo_data *d = malloc(sizeof(struct foo_data)); |
struct foo_data *d; |
334 |
|
|
335 |
if (d == NULL) { |
CHECK_ALLOCATION(d = malloc(sizeof(struct foo_data))); |
|
fprintf(stderr, "out of memory\n"); |
|
|
exit(1); |
|
|
} |
|
336 |
memset(d, 0, sizeof(struct foo_data)); |
memset(d, 0, sizeof(struct foo_data)); |
337 |
|
|
338 |
/* |
/* |
339 |
* Set up stuff here, for example fill d with useful |
* Set up stuff here, for example fill d with useful |
340 |
* data. devinit contains settings like address, irq_nr, |
* data. devinit contains settings like address, irq path, |
341 |
* and other things. |
* and other things. |
342 |
* |
* |
343 |
* ... |
* ... |
344 |
*/ |
*/ |
345 |
|
|
346 |
|
INTERRUPT_CONNECT(devinit->interrupt_path, d->irq); |
347 |
|
|
348 |
memory_device_register(devinit->machine->memory, devinit->name, |
memory_device_register(devinit->machine->memory, devinit->name, |
349 |
devinit->addr, DEV_FOO_LENGTH, |
devinit->addr, DEV_FOO_LENGTH, |
368 |
should be defined. |
should be defined. |
369 |
<pre> |
<pre> |
370 |
struct foo_data { |
struct foo_data { |
371 |
int irq_nr; |
struct interrupt irq; |
372 |
/* ... */ |
/* ... */ |
373 |
}; |
}; |
374 |
</pre><br> |
</pre><br> |
375 |
(There is an exception to this rule; ugly hacks which allow |
(There is an exception to this rule; some legacy code and other |
376 |
code in <tt>src/machine.c</tt> to use some structures makes it |
ugly hacks have their device structs defined in |
377 |
necessary to place the <tt>struct foo_data</tt> in |
<tt>src/include/devices.h</tt> instead of <tt>dev_foo.c</tt>. |
378 |
<tt>src/include/devices.h</tt> instead of in <tt>dev_foo.c</tt> |
New code should not add stuff to <tt>devices.h</tt>.) |
|
itself. This is useful for example for interrupt controllers.) |
|
379 |
<p> |
<p> |
380 |
<li>If <tt>foo</tt> has a tick function (that is, something that needs to be |
<li>If <tt>foo</tt> has a tick function (that is, something that needs to be |
381 |
run at regular intervals) then <tt>FOO_TICKSHIFT</tt> and a tick |
run at regular intervals) then <tt>FOO_TICKSHIFT</tt> and a tick |
383 |
<pre> |
<pre> |
384 |
#define FOO_TICKSHIFT 14 |
#define FOO_TICKSHIFT 14 |
385 |
|
|
386 |
void dev_foo_tick(struct cpu *cpu, void *extra) |
DEVICE_TICK(foo) |
387 |
{ |
{ |
388 |
struct foo_data *d = (struct foo_data *) extra; |
struct foo_data *d = extra; |
389 |
|
|
390 |
if (.....) |
if (.....) |
391 |
cpu_interrupt(cpu, d->irq_nr); |
INTERRUPT_ASSERT(d->irq); |
392 |
else |
else |
393 |
cpu_interrupt_ack(cpu, d->irq_nr); |
INTERRUPT_DEASSERT(d->irq); |
394 |
} |
} |
395 |
</pre><br> |
</pre><br> |
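<p>To make the assert/deassert pattern in the tick function concrete, here
is a small stand-alone sketch that can be compiled outside the emulator. It
does not use GXemul's <tt>struct interrupt</tt> or its
<tt>INTERRUPT_ASSERT</tt>/<tt>INTERRUPT_DEASSERT</tt> macros; the
<tt>mock_irq</tt> type, the countdown-timer behaviour, and the
<tt>counter</tt> and <tt>irq_enable</tt> fields are all assumptions chosen
for illustration. What it shows is that the tick function recomputes the
level of the interrupt line on every call, rather than only raising it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an interrupt line; NOT GXemul's struct interrupt. */
struct mock_irq {
	int asserted;
};

static void mock_irq_assert(struct mock_irq *irq)   { irq->asserted = 1; }
static void mock_irq_deassert(struct mock_irq *irq) { irq->asserted = 0; }

/* Hypothetical device state: a countdown timer with an interrupt enable bit. */
struct foo_data {
	struct mock_irq irq;
	uint32_t counter;
	int irq_enable;
};

/*
 * Called at regular intervals. Counts down to zero, then drives the
 * interrupt line from the current state on every tick.
 */
void foo_tick(struct foo_data *d)
{
	if (d->counter > 0)
		d->counter--;

	if (d->irq_enable && d->counter == 0)
		mock_irq_assert(&d->irq);
	else
		mock_irq_deassert(&d->irq);
}

/* Run nticks ticks from a given start count; return the line state. */
int foo_demo(uint32_t start, int nticks)
{
	struct foo_data d = { {0}, start, 1 };
	int i;

	assert(nticks >= 0);
	for (i = 0; i < nticks; i++)
		foo_tick(&d);
	return d.irq.asserted;
}
```

<p>With a start count of 3, the line stays low for the first two ticks and
goes high on the third; it then stays high until the device state changes
(for example, until a write handler clears <tt>irq_enable</tt> or reloads
the counter).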
396 |
|
|
419 |
struct foo_data *d = extra; |
struct foo_data *d = extra; |
420 |
uint64_t idata = 0, odata = 0; |
uint64_t idata = 0, odata = 0; |
421 |
|
|
422 |
idata = memory_readmax64(cpu, data, len); |
if (writeflag == MEM_WRITE) |
423 |
|
idata = memory_readmax64(cpu, data, len); |
424 |
|
|
425 |
switch (relative_addr) { |
switch (relative_addr) { |
426 |
/* .... */ |
|
427 |
|
/* Handle accesses to individual addresses within |
428 |
|
the device here. */ |
429 |
|
|
430 |
|
/* ... */ |
431 |
|
|
432 |
} |
} |
433 |
|
|
434 |
if (writeflag == MEM_READ) |
if (writeflag == MEM_READ) |