--- trunk/src/cpus/README_DYNTRANS 2007/10/08 16:19:23 20
+++ trunk/src/cpus/README_DYNTRANS 2007/10/08 16:20:40 30
@@ -1,20 +1,4 @@
-$Id: README_DYNTRANS,v 1.6 2005/11/24 01:15:06 debug Exp $
-
--------------------------------------------------------------------
-
-PPC optimizations TODO:
-
-        find high-level bottlenecks!
-        inline cr0 field calculation
-        inline pc to pointers calculation
-        load/store with r1 as base
-        multiple load/stores in a row
-        all forms of branches, similar optimizations as with ARM
-                (conditional, link etc)
-
--------------------------------------------------------------------
-
-
+$Id: README_DYNTRANS,v 1.11 2006/07/27 02:18:07 debug Exp $
 
 Dyntrans TODO:
 
@@ -25,35 +9,34 @@
         ------      -------                       -----     -----
         Alpha       32-bit                        64        no
         ARM         32-bit, 16-bit (Thumb)        32        no
-        Atmel AVR   16-bit                        8         no
+        Atmel AVR   16-bit + variable             8         no
         F-CPU       ?                             ?         ?
+        H8          16-bit                        8/16      no
         HPPA        32-bit                        64/32     yes
         i960        32-bit + variable             32        ?
         IA64        128-bit                       64        no
         M68K        16-bit + variable             32        no
-        M88K        ?                             32 (?)    ?
+        M88K        32-bit (+var?)                32        ?
         MIPS        32-bit, 16-bit (MIPS16)       64/32     yes
         OpenRISC    ?                             ?         ?
         PC532       ?                             32 (?)    ?
         POWER/PPC   32-bit                        64/32     no
         SH          32-bit, 16-bit (SHcompact)    64/32     yes(*)
         SPARC       32-bit                        64/32     yes
+        Transputer  8-bit                         32/16     no
         x86         8-bit + variable              64/32/16  no
         VAX         8-bit + variable              32        no
 
         (*) Delay slot in SHcompact?
 
-        x) call/return address cache?
-
         x) instr_call sequence analysis support?
            (For handtuning combinations.)
 
         x) opcode statistics support?
                 TODO: is instr_call statistics enough?
-                TODO: a command line option to turn off instruction
-                combinations (for debugging)
 
         x) load/stores:
+                o) perhaps refactor/reuse common load/store code?
                 o) support for archs that allow transparent unaligned
                    load/stores (ppc, x86 etc)
                 o) alignment checks ==> exceptions
@@ -64,11 +47,13 @@
         x) SMP: detect when an instruction such as ll/sc or cas is used,
            and "synchronize" approximately the number of executed
            instructions (or cycles) across all CPUs.
+           Problem: devices such as dev_mp don't work well with such a synch.
+           scheme.
 
         x) support for variable-length instructions (x86, m68k, i960, ...)
-                Solution: don't increase the next_ic between every
-                instruction, but let each instruction's handler do
-                that for itself.
+                Current solution: ic->arg[0] contains the length of the
+                instruction (in bytes), and next_ic is
+                automatically updated.
                 Problem: what about instructions crossing a (virtual) page
                 boundary? They cannot be translated once and
                 for all :( and must be interpreted slowly!
@@ -79,18 +64,11 @@
 
         x) various register-window archs (SPARC etc)
 
-        x) Atmel AVR etc?
-
         x) Alpha: hahaha, zapnot and inserts/extracts
            don't compile into very nice code :-|
            fix this
            Solution: if short assembly language snippets can be
            compiled on the current host, then compile such
            snippets for alpha_instr_zapnot etc.
 
-        x) 64-bit virtual memory translation tables (PPC, Alpha, MIPS,
-           HPPA, sh, amd64, etc)
-
-        x) x86: convert to dyntrans. LOTS of stuff to consider.
-
-        x) 88k? vax? pc532? 6502? 6800? etc
+        x) pc532? 6502? 6800? etc