This feature is enabled with the build flag BOXEDWINE_VM, and requires BOXEDWINE_64BIT_MMU (see hard MMU).
This is a big change from how the normal CPU emulation works. For one, it is multi-threaded: each emulated thread has its own host thread. This isn't even possible on the normal CPU emulator, because then I would have to implement the lock prefix, which would just make everything slower. On x64 I am able to use the hardware lock prefix.
When translating x86 instructions to x64, some of them are really simple and require no change. For example
mov eax, ecx
would require no change; the machine code is exactly the same.
Some instructions have been removed. For example, “inc eax” in x86 can be encoded in two ways, as machine code “40” or as “ff c0”. The first encoding was removed in x64 to make room for a new prefix (REX) that tells the CPU whether the instruction is dealing with a 32-bit or 64-bit register.
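As a rough sketch of what that rewrite involves, the helper below re-encodes the short one-byte form into the longer form that still exists on x64. The function name and signature are hypothetical and not taken from the Boxedwine source. It also covers the one-byte “dec” forms (48 to 4F), which the text above does not discuss but which were removed for the same reason.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Boxedwine code.
 * Re-encodes the one-byte x86 forms "inc r32" (40..47) and "dec r32" (48..4F)
 * as the two-byte forms "FF /0" and "FF /1", which are still valid on x64.
 * Returns the number of bytes written to out, or 0 if the opcode is not one
 * of the short inc/dec forms. */
size_t reencode_short_inc_dec(uint8_t opcode, uint8_t out[2])
{
    if (opcode < 0x40 || opcode > 0x4F) {
        return 0;
    }
    uint8_t reg = opcode & 0x07;             /* 0 = eax, 1 = ecx, ... 7 = edi */
    uint8_t ext = (opcode >= 0x48) ? 1 : 0;  /* /0 selects inc, /1 selects dec */
    out[0] = 0xFF;
    out[1] = (uint8_t)(0xC0 | (ext << 3) | reg); /* ModRM with mod=11: register operand */
    return 2;
}
```

Passing 0x40 (“inc eax”) fills the output with ff c0, the second encoding mentioned above.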
Anything that touches memory needs to be rewritten to take the emulated MMU into account. With hard MMU (which this requires), that just means adding a simple offset. So for example
mov eax, [DS:eax]
would have to be changed to
lea r13d, [r15+eax]
mov eax, [r10+r13]
where
- r13 is a tmp register
- r15 register holds the DS segment address
- r10 holds the memory offset
So in the best case scenario, memory access instructions will become two instructions, thus slowing things down a little bit.
In the above scenario I used “lea” instead of “add” so that it would not affect the CPU flags.
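The address arithmetic done by those two instructions can be modelled in C roughly as follows. This is only an illustration: the function and parameter names are made up, and it assumes the segment address and the register are combined as a 32-bit value (writing r13d clears the upper half of r13) before the 64-bit memory offset is added.

```c
#include <stdint.h>

/* Hypothetical model of the two-instruction sequence, not Boxedwine code.
 *   lea r13d, [r15+eax]  -> 32-bit sum of segment address and register,
 *                           computed without touching the CPU flags
 *   mov eax, [r10+r13]   -> 64-bit sum with the memory offset, then the load */
uint64_t host_address(uint64_t mem_offset, uint32_t seg_address, uint32_t reg)
{
    uint32_t emulated = seg_address + reg;   /* wraps at 4 GB, like the r13d result */
    return mem_offset + (uint64_t)emulated;  /* location in the host address space */
}
```

Because the first step is done in 32 bits, an emulated address that overflows wraps around inside the 4 GB emulated space instead of reaching past it.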
x64 has 8 extra registers. I use them like this:
- r8, r12, r13 as tmp
- r9 holds pointer to CPU structure
- r10 holds memory offset
- r11 holds stack pointer
- r14 holds SS segment address
- r15 holds DS segment address
Notice that I use my own stack register instead of rsp. This is because I need to push and pop 2-byte and 4-byte values, whereas x64 would expect rsp to be 8-byte aligned.
So what happens if we need to push a register onto the stack?
push ebp
This will become
pushfq                         // we don't want this push instruction to change CPU flags
lea r8d,[r11-4]                // calculate where we will write the value (ESP-4)
and r8d,dword ptr [r9+2Ch]     // and that value with the stack mask (r9 is cpu, 2C is offset to stack mask)
lea r8d,[r8+r14]               // add stack segment to address
mov dword ptr [r8+r10],ebp     // puts ebp on the stack (might cause exception, which is why esp isn't updated yet)
and r11d,dword ptr [r9+30h]    // and the original esp value with the not stack mask
or r11d,r8d                    // or the new stack with the old
popfq
In C code this looks like
void push32(struct CPU* cpu, U32 value) {
    U32 new_esp=(ESP & cpu->stackNotMask) | ((ESP - 4) & cpu->stackMask);
    writed(cpu->thread, cpu->segAddress[SS] + (new_esp & cpu->stackMask), value);
    ESP = new_esp;
}
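To see what the two masks do, here is a small self-contained model of the same logic. The struct, its field names and the flat memory buffer are all hypothetical stand-ins for the real CPU structure. It also assumes that the stack mask is 0xFFFF for a 16-bit stack (0xFFFFFFFF for a 32-bit one) and that the not mask is its bitwise complement, which the text above does not spell out.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the emulator's CPU state. */
struct ToyCpu {
    uint32_t esp;
    uint32_t stack_mask;      /* assumed: 0xFFFF for a 16-bit stack, 0xFFFFFFFF for 32-bit */
    uint32_t stack_not_mask;  /* assumed: bitwise complement of stack_mask */
    uint32_t ss_address;      /* SS segment address */
    uint8_t *memory;          /* flat buffer standing in for emulated memory */
};

void toy_push32(struct ToyCpu *cpu, uint32_t value)
{
    /* Only the bits selected by stack_mask are decremented; the rest of ESP is kept. */
    uint32_t new_esp = (cpu->esp & cpu->stack_not_mask) | ((cpu->esp - 4) & cpu->stack_mask);
    uint32_t address = cpu->ss_address + (new_esp & cpu->stack_mask);
    memcpy(cpu->memory + address, &value, sizeof value); /* store first ... */
    cpu->esp = new_esp;                                   /* ... then commit the new ESP */
}
```

With a 16-bit stack and ESP = 0x00010002, pushing 4 bytes wraps the low half to 0xFFFE while the high half is preserved, so ESP becomes 0x0001FFFE and the value is written at the SS segment address plus 0xFFFE.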