File Names
Most of the time, there is a file className.h (the header) and a file
className.c (the implementation). Sometimes there is also a file className.inline.h
that holds the implementation of the inline functions for that class. In some
places, however, there are className.h and className.dcl.h. Sometimes the
class definition is in .h and the inline functions are in .dcl.h, and sometimes
vice versa. To simplify grepping through the source, I took the liberty of
renaming the files that contain inline code to .inline.h.
Additionally, I prefixed all files containing i386 specifics with 'i386-'.
I did not use 'i386.' by analogy with 'sun4.' because I started in vm/asm with
files named 'sparc-*'. This is consistent with GNU naming conventions, and it
also simplifies my grepping.
Assembler
The assembler is derived from the machine-specific parts of gas (GNU
binutils). It is wrapped in the same kind of object-oriented shell as
the Sparc one. (See directory vm/asm, especially i386*.)
The supported instructions are:
mov, push, pop, pusha, popa, add, sub, inc, dec, neg, mul, imul,
div, idiv, cmp, test, and, or, xor, not, shl, shr, sal, sar, jmp, jcc,
call, ret, lea, clr, nop, align
As in the Sparc assembler, there is a function for each assembler instruction,
named after the instruction with the first letter capitalized
(e.g. Call). Unlike in the Sparc assembler, the operand type
is not encoded in the function name. The functions do not take locations
or numbers as operands but objects of class Operand. This class
has several subclasses for the different addressing modes:
class name | description | assembler notation
R | register direct | %eax
I | immediate | $3
M | memory | my_label
B | based | (%ebx)
Bd | based with displacement | 4(%ebx)
Id | indexed with displacement | 4(,2,%eax)
BId | based with index and displacement | 4(%ebx,2,%eax)
PC | PC relative (for jumps) |
Md | memory direct (for jumps), creates a relocated PC relative jump |
The operand type is given separately for each operand. The generated AddrDescs
contain offsets to the respective values inside the instruction, rather than
referring to the whole instruction as in the Sparc code.
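To illustrate the idea of passing Operand objects to one function per instruction, here is a minimal sketch. It is not the code from vm/asm: the class names R, I and Bd are taken from the table, but the constructors, the text() and isMemory() members, the toy Assembler that records a listing instead of emitting machine code, and the source-then-destination operand order are all assumptions made for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: only the class names come from the table above.
struct Operand {
    virtual ~Operand() = default;
    // Render the operand in the assembler notation of the table.
    virtual std::string text() const = 0;
    // True if the operand refers to memory.
    virtual bool isMemory() const { return false; }
};

// Register direct, e.g. %eax
struct R : Operand {
    std::string name;
    explicit R(std::string n) : name(std::move(n)) {}
    std::string text() const override { return "%" + name; }
};

// Immediate, e.g. $3
struct I : Operand {
    int value;
    explicit I(int v) : value(v) {}
    std::string text() const override { return "$" + std::to_string(value); }
};

// Based with displacement, e.g. 4(%ebx)
struct Bd : Operand {
    int disp;
    std::string base;
    Bd(int d, std::string b) : disp(d), base(std::move(b)) {}
    std::string text() const override {
        return std::to_string(disp) + "(%" + base + ")";
    }
    bool isMemory() const override { return true; }
};

// Toy assembler that records text lines instead of machine code.
struct Assembler {
    std::vector<std::string> listing;

    // The operand kind comes from the argument's class,
    // not from the function name.
    void Mov(const Operand& src, const Operand& dst) {
        // i386 'mov' cannot take two memory operands.
        assert(!(src.isMemory() && dst.isMemory()));
        listing.push_back("mov " + src.text() + ", " + dst.text());
    }

    void Push(const Operand& src) {
        listing.push_back("push " + src.text());
    }
};
```

With these definitions, `a.Mov(I(3), R("eax"))` on an `Assembler a` records `mov $3, %eax`, and `a.Push(Bd(4, "ebx"))` records `push 4(%ebx)`.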
Register Usage
i386 has 8 registers:
Name | Description
EAX | Return value
EDX | call clobbered, temp
ECX | call clobbered, temp
EBX | call saved, local
ESI | call saved, ByteMapBase
EDI | call saved, currentProcess
EBP | call saved, frame pointer (FP)
ESP | hardwired stack pointer (SP)
I wanted to have as little trouble as possible when calling C code,
so I stuck to the C calling convention wherever I could. EAX is used
for passing the result. ECX and EDX are temporary registers, EBX is a 'local'
register, ESI holds the ByteMapBase pointer, and EDI points to
the currentProcess, where all other values that live in global registers on
Sparc are stored. EBP is used as the frame pointer.
In the future, if the byte map base is compiled in as an absolute address,
and if some new kind of AddrDesc is used to update all referring locations
when the base address changes, ESI can be used as an additional local register.
The register for currentProcess can also be abandoned in favor of the global
variable in the future.
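The register assignment described above could be written down roughly as follows. The numeric values are the hardware encodings of the i386 registers; the role names (ResultReg, Temp1 and so on) and the helper functions are hypothetical and chosen only for this sketch.

```cpp
#include <cassert>

// Hardware encodings of the eight 32-bit general-purpose registers.
enum Reg { EAX = 0, ECX = 1, EDX = 2, EBX = 3, ESP = 4, EBP = 5, ESI = 6, EDI = 7 };

// Hypothetical role names; the assignment follows the register table.
constexpr Reg ResultReg         = EAX;
constexpr Reg Temp1             = ECX;
constexpr Reg Temp2             = EDX;
constexpr Reg LocalReg          = EBX;
constexpr Reg ByteMapBaseReg    = ESI;
constexpr Reg CurrentProcessReg = EDI;
constexpr Reg FramePtr          = EBP;
constexpr Reg StackPtr          = ESP;

// True for the registers the table marks as 'call saved'.
constexpr bool isCallSaved(Reg r) {
    return r == EBX || r == ESI || r == EDI || r == EBP;
}

// True for the registers the table marks as 'call clobbered, temp'.
constexpr bool isTemp(Reg r) {
    return r == ECX || r == EDX;
}

// Assembler-style name of a register, e.g. "%eax".
inline const char* regName(Reg r) {
    static const char* const names[] = {
        "%eax", "%ecx", "%edx", "%ebx", "%esp", "%ebp", "%esi", "%edi"
    };
    assert(r >= EAX && r <= EDI);
    return names[r];
}
```

If ESI is freed up later as suggested, only the role constants and isCallSaved would need to change in such a scheme.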
Stack Frame
Since the i386 has no register saving mechanism like the one in Sparc, the
receiver has to go into a stack location anyway. It is therefore
stored on the stack, conforming to the C calling convention. In the absence
of 'branch and link' instructions, calls to 'recompile' or 'di' are ordinary
calls that push their return address on the stack.
SP --> | outgoing receiver |
       | outgoing arguments |
       | local slots |
       | saved registers (currently EBX) |
       | current_pc |
       | pc_chain |
FP --> | saved EBP |
       | return address |
       | incoming receiver |
       | incoming arguments |
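Read as offsets from the frame pointer, the layout above might look like the following sketch. It assumes that every slot is one 32-bit word, that exactly one register (EBX) is saved, that incoming argument 0 is the one next to the receiver, and that local slot 0 is the one next to the saved register. All constant and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed size of one stack slot in bytes.
constexpr int kWord = 4;

// Byte offsets relative to FP (EBP), derived from the frame diagram.
constexpr int kSavedFpOffset      = 0;           // saved EBP
constexpr int kReturnAddrOffset   = 1 * kWord;   // pushed by 'call'
constexpr int kIncomingRcvrOffset = 2 * kWord;   // incoming receiver
constexpr int kPcChainOffset      = -1 * kWord;  // pc_chain
constexpr int kCurrentPcOffset    = -2 * kWord;  // current_pc
constexpr int kSavedEbxOffset     = -3 * kWord;  // saved EBX

// Offset of incoming argument i (0-based); arguments lie above the receiver.
inline int incomingArgOffset(int i) {
    assert(i >= 0);
    return kIncomingRcvrOffset + (i + 1) * kWord;
}

// Offset of local slot i (0-based); locals lie below the saved registers.
inline int localSlotOffset(int i) {
    assert(i >= 0);
    return kSavedEbxOffset - (i + 1) * kWord;
}

// Address of a slot, given the frame pointer and a byte offset.
inline std::uint32_t slotAddress(std::uint32_t fp, int offset) {
    return fp + static_cast<std::uint32_t>(offset);
}
```

Under these assumptions the incoming receiver of a frame with FP 0x1000 sits at 0x1008 and its pc_chain slot at 0x0FFC.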
SendDescs
On a method call, more detailed information about the state is stored
inside the calling code, where it can be found via the return address. Although
I do not like filling the I-cache with data, I did not see a simple way
to avoid this. The sending call instruction is aligned oddly in order
to produce a 32-bit aligned return address. The called code returns with
'ret', which can make use of the call-return buffer sometimes found in i386
CPUs. The following 2 words (32-bit quantities), which are occupied by
a call and a delay slot on Sparc, are used for a jump that skips
the whole data section that follows.
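The odd alignment of the call can be computed as in the sketch below. It assumes that the send uses a direct near call (opcode 0xE8 plus a 32-bit displacement, five bytes in total); the function name callPadding is hypothetical, and the padding bytes would be filled with something like 'nop'.

```cpp
#include <cassert>
#include <cstdint>

// Length of a direct near call: opcode 0xE8 plus a 32-bit displacement.
constexpr std::uint32_t kCallLength = 5;

// Number of padding bytes to emit at 'pc' so that the return address of a
// call placed right after the padding is aligned to 4 bytes.
inline std::uint32_t callPadding(std::uint32_t pc) {
    std::uint32_t pad = (4 - (pc + kCallLength) % 4) % 4;
    // The call now starts at pc + pad and returns to an aligned address.
    assert((pc + pad + kCallLength) % 4 == 0);
    return pad;
}
```

Under this assumption the call always starts at an address that is 3 modulo 4, so a call emitted at an aligned address needs three padding bytes.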
Stack Walking
A big difference from Sparc is the following: Sparc calls a subroutine and saves
the return address in a register. A stack frame is then atomically allocated
with a 'save' instruction that stores the return address along with other
saved values on the stack (lazily). On i386, first the call
pushes the return address, then the old FP is pushed, and then new
space is allocated by subtracting from the stack pointer. If you look at
the stack pointer, you never really know whether it points to the top of
a stack frame, to a new return address, or to a saved FP. Stack
walking code should therefore orient itself by the frame pointer (EBP).
The code that may walk the stack is in directory 'vm/runtime',
in the files frame.[ch], stack.[ch], and process.[ch]. A stack frame 'frame'
is composed of two 'halfFrame's that represent the incoming and the outgoing
end. A Sparc stack starts at the current stack pointer (SP) and winds
upwards as a linked list, where the stack pointer is called frame pointer
(FP) once it is saved. On i386, however, this list is rooted at the current
frame pointer (EBP). To obtain the location of the outgoing arguments
of the topmost frame, it is probably necessary to look up the frame size of
the executing code. Furthermore, since Sparc saves from outgoing to
incoming registers and only then to the stack, while i386 saves on the stack
in the first place, I assume that accesses to arguments or local variables
must be un-shifted by one frame.
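The following sketch simulates the i386 call and frame setup on a small model of memory and then walks the frames by following the chain of saved frame pointers. It is only an illustration, not the code from vm/runtime: the Cpu and Memory types, the function names, and the convention that a saved FP of 0 marks the outermost frame are assumptions of this example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Simulated 32-bit memory (address -> word), standing in for the real stack.
using Memory = std::map<std::uint32_t, std::uint32_t>;

struct Cpu {
    std::uint32_t esp;
    std::uint32_t ebp;
    Memory mem;

    void push(std::uint32_t v) {
        esp -= 4;
        mem[esp] = v;
    }

    // Simulate a call followed by the usual frame setup.
    void callAndEnter(std::uint32_t returnAddr, std::uint32_t frameBytes) {
        push(returnAddr);   // call: SP points at a new return address
        push(ebp);          // push %ebp: SP points at a saved FP
        ebp = esp;          // mov %esp, %ebp: FP is now valid
        esp -= frameBytes;  // sub $n, %esp: SP points at the frame top
    }
};

// Follow the saved frame pointers starting at EBP and collect the return
// addresses, innermost frame first. A saved FP of 0 ends the walk.
inline std::vector<std::uint32_t> walkStack(const Cpu& cpu) {
    std::vector<std::uint32_t> result;
    std::uint32_t fp = cpu.ebp;
    while (fp != 0) {
        result.push_back(cpu.mem.at(fp + 4));  // return address above saved FP
        std::uint32_t next = cpu.mem.at(fp);   // saved EBP of the caller
        assert(next == 0 || next > fp);        // callers lie at higher addresses
        fp = next;
    }
    return result;
}
```

Note that the walk never consults ESP, which is the point made above: only EBP gives a reliable entry into the chain of frames.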