CPU Emulation

This page describes the base CPU emulation, which works on all platforms.  Other CPU emulators are here:

The CPU emulation source is in the c/source/emulation/cpu directory and is based on Dosbox’s normal and dynarec CPU cores.  The emulator reads in a few instructions and generates a block.  A block consists of a linked list of ops, where each op maps to a single x86 instruction.  Each op holds a pointer to the function that emulates the instruction, along with the decoded data, such as which registers are used and any constants.  The last op in the list is an instruction that can result in a jump, for example retn, call, jmp or jo.  Each op also has a pointer to the next op, so when a block is executed, the function associated with each op runs and then calls the next op.

After the block returns, the emulator looks up the next block in a hashmap cache keyed by the current eip (instruction pointer).  If the block is not found, the emulator decodes the instructions at that address, creates a block and puts it in the cache.  Some instructions, like conditional jumps, always go to either the next instruction or an instruction at a fixed offset.  Since both targets are constant, the next blocks can be stored in the block itself, which improves performance by skipping the hash table lookup.
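The decode, cache and execute loop described above can be sketched roughly as follows.  This is a minimal illustration, not the Boxedwine source: the struct layouts, the names decodeBlock, getBlock and runBlocks, and the two-byte guest encoding (0x01 = add a constant to reg 0, 0x02 = jump to an absolute address) are all assumptions made up for the example.  Direct linking of successor blocks is left out to keep it short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified types; the real structures differ. */
struct Cpu { uint32_t reg[8]; uint32_t eip; };
struct Op;
typedef void (*OpFunc)(struct Cpu *cpu, struct Op *op);

struct Op {
    OpFunc func;        /* handler that emulates one decoded instruction */
    uint32_t data1;     /* decoded constant or jump target */
    uint32_t len;       /* encoded length of the instruction in bytes */
    struct Op *next;    /* next op in the block, NULL after the last op */
};

struct Block {
    uint32_t eip;        /* guest address of the first instruction */
    struct Op *ops;      /* head of the op list */
    struct Block *chain; /* next block in the same hash bucket */
};

/* Toy guest program in a made-up encoding (NOT x86). */
static const uint8_t guest[] = {
    0x01, 5, 0x01, 7, 0x02, 6,  /* 0: add 5   2: add 7   4: jmp 6 */
    0x01, 1, 0x02, 0            /* 6: add 1   8: jmp 0 */
};

#define CACHE_BUCKETS 256
static struct Block *cache[CACHE_BUCKETS];
unsigned decodeCount;           /* how many blocks have been decoded */

static void op_add(struct Cpu *cpu, struct Op *op) {
    cpu->reg[0] += op->data1;
    cpu->eip += op->len;
    op->next->func(cpu, op->next);  /* chain to the next op in the block */
}

static void op_jmp(struct Cpu *cpu, struct Op *op) {
    cpu->eip = op->data1;           /* last op: set eip and return */
}

/* Decode ops until an instruction that can jump ends the block. */
static struct Block *decodeBlock(uint32_t eip) {
    struct Block *block = calloc(1, sizeof *block);
    struct Op **tail;
    uint8_t opcode;

    assert(block != NULL);
    block->eip = eip;
    tail = &block->ops;
    decodeCount++;
    do {
        struct Op *op = calloc(1, sizeof *op);
        assert(op != NULL && eip + 1 < sizeof guest);
        opcode = guest[eip];
        op->func = (opcode == 0x02) ? op_jmp : op_add;
        op->data1 = guest[eip + 1];
        op->len = 2;
        eip += op->len;
        *tail = op;
        tail = &op->next;
    } while (opcode != 0x02);
    return block;
}

/* Look the block up by eip; decode and cache it on a miss. */
static struct Block *getBlock(uint32_t eip) {
    struct Block **bucket = &cache[eip % CACHE_BUCKETS];
    struct Block *block;

    for (block = *bucket; block != NULL; block = block->chain) {
        if (block->eip == eip) return block;
    }
    block = decodeBlock(eip);
    block->chain = *bucket;
    *bucket = block;
    return block;
}

/* Execute `count` blocks starting at cpu->eip. */
void runBlocks(struct Cpu *cpu, int count) {
    while (count-- > 0) {
        struct Block *block = getBlock(cpu->eip);
        block->ops->func(cpu, block->ops);
    }
}
```

In this sketch the toy program loops between two blocks, so running it for any number of iterations decodes each block only once; every later visit is served from the cache.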

An expensive part of CPU emulation is CPU flag calculation (See Wikipedia for flag details).  For example, when an add happens, the sign, overflow, carry, zero and other flags need to be set.  Boxedwine follows the same pattern of lazy flag calculation that Dosbox uses.  The left and right operands and the result of the calculation are all stored, along with the type of calculation, so that any flag can be calculated on the fly when it is needed.  Here is an example add instruction that adds a constant to a register.

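void OPCALL add32_reg(struct CPU* cpu, struct Op* op) {
    cpu->dst.u32 = cpu->reg[op->r1].u32;            /* left operand: the register */
    cpu->src.u32 = op->data1;                       /* right operand: the decoded constant */
    cpu->result.u32 = cpu->dst.u32 + cpu->src.u32;  /* keep the result for later flag queries */
    cpu->lazyFlags = FLAGS_ADD32;                   /* record the type of calculation */
    cpu->reg[op->r1].u32 = cpu->result.u32;
    CYCLES(1);
    NEXT();                                         /* continue with the next op in the block */
}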

FLAGS_ADD32 is actually a pointer to a structure of function pointers, one function pointer for each flag.  So if the carry flag (CF) is needed, the code calls cpu->lazyFlags->getCF, and when lazyFlags is FLAGS_ADD32, getCF_add32 is called.

U32 getCF_add32(struct CPU* cpu) {return cpu->result.u32<cpu->dst.u32;}
U32 getOF_add32(struct CPU* cpu) {return ((cpu->dst.u32 ^ cpu->src.u32 ^ 0x80000000) & (cpu->result.u32 ^ cpu->src.u32)) & 0x80000000;}
U32 getAF_add32(struct CPU* cpu) {return ((cpu->dst.u32 ^ cpu->src.u32) ^ cpu->result.u32) & 0x10;}

struct LazyFlags flagsAdd32 = {getCF_add32, getOF_add32, getAF_add32, getZF_32, getSF_32, getPF_8};
struct LazyFlags* FLAGS_ADD32 = &flagsAdd32;
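To show how the pieces above fit together, here is a small self-contained sketch of the lazy flag pattern.  It is an illustration under assumptions, not the Boxedwine code: the Cpu struct is reduced to the three saved values, only CF and ZF are modeled, and the sub32 helper with its getCF_sub32 function is a hypothetical addition used to show that the same getCF call dispatches differently depending on the last calculation.

```c
#include <assert.h>
#include <stdint.h>

struct Cpu;

/* One function pointer per flag (only CF and ZF in this sketch). */
struct LazyFlags {
    uint32_t (*getCF)(struct Cpu *cpu);
    uint32_t (*getZF)(struct Cpu *cpu);
};

/* Simplified CPU state: saved operands, result and calculation type. */
struct Cpu {
    uint32_t dst, src, result;
    const struct LazyFlags *lazyFlags;
};

/* Add carries out when the 32-bit result wrapped below the left operand. */
static uint32_t getCF_add32(struct Cpu *cpu) { return cpu->result < cpu->dst; }
/* Hypothetical sub variant: borrow when the left operand is smaller. */
static uint32_t getCF_sub32(struct Cpu *cpu) { return cpu->dst < cpu->src; }
/* ZF depends only on the result, so both operations share it. */
static uint32_t getZF_32(struct Cpu *cpu) { return cpu->result == 0; }

static const struct LazyFlags flagsAdd32 = { getCF_add32, getZF_32 };
static const struct LazyFlags flagsSub32 = { getCF_sub32, getZF_32 };

/* The operations only record their inputs; no flag is computed here. */
uint32_t add32(struct Cpu *cpu, uint32_t a, uint32_t b) {
    cpu->dst = a;
    cpu->src = b;
    cpu->result = a + b;
    cpu->lazyFlags = &flagsAdd32;
    return cpu->result;
}

uint32_t sub32(struct Cpu *cpu, uint32_t a, uint32_t b) {
    cpu->dst = a;
    cpu->src = b;
    cpu->result = a - b;
    cpu->lazyFlags = &flagsSub32;
    return cpu->result;
}

/* Usage: flags are evaluated only when something asks for them. */
void lazyFlagsDemo(void) {
    struct Cpu cpu;

    add32(&cpu, 0xFFFFFFFFu, 1u);            /* wraps around to 0 */
    assert(cpu.lazyFlags->getCF(&cpu) == 1);
    assert(cpu.lazyFlags->getZF(&cpu) == 1);

    sub32(&cpu, 1u, 2u);                     /* borrows */
    assert(cpu.lazyFlags->getCF(&cpu) == 1);
    assert(cpu.lazyFlags->getZF(&cpu) == 0);
}
```

The gain comes from the common case where an instruction overwrites the flags before anything reads them: the operation pays for four stores, and the flag functions are never called.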

When caching blocks of decoded instructions we need to watch out for self-modifying code.  To do this, the code monitors writes to pages containing code that has been decoded and cached.  See Memory for more information on memory emulation.

For the hard MMU, each page that contains code is marked read-only, and the exception raised by a write to it is caught.  For the soft MMU, which is easier to understand, this is how it is done:

Each page of memory has a structure containing the functions that know how to read and write that page.  For pages with code, it will be:

struct Page codePage = {ram_readb, code_writeb, ram_readw, code_writew, ram_readd, code_writed, ram_clear, code_physicalAddress};

So reads just use the normal ram functions, but writes first check whether any code is cached at that location and, if so, clear the cache before forwarding the write to the normal ram_write call.

static void code_writeb(struct KThread* thread, U32 address, U8 value) {
    if (value!=readb(thread, address)) {
        struct Memory* memory = thread->process->memory;
        int index = address >> PAGE_SHIFT;
        U32 ram = memory->ramPage[index];

        removeBlockAt(thread, address);
        host_writeb(address-TO_TLB(ram, address), value);
    }
}
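The soft MMU approach can be tried out in isolation with a sketch like the one below.  Everything in it is a simplified assumption for illustration: the Page struct has only byte handlers, guest memory is one flat array, the helpers memInit, markCode and hasCachedCode are hypothetical, and a per-page counter stands in for removeBlockAt, so a write clears all cached code on the page instead of a single block.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define PAGE_COUNT 4

/* Per-page handlers (byte access only in this sketch). */
struct Page {
    uint8_t (*readb)(uint32_t address);
    void (*writeb)(uint32_t address, uint8_t value);
};

static uint8_t ram[PAGE_COUNT * PAGE_SIZE];  /* flat guest memory */
static int cachedBlocks[PAGE_COUNT];         /* decoded blocks per page */
static const struct Page *pages[PAGE_COUNT];

static uint8_t ram_readb(uint32_t address) { return ram[address]; }
static void ram_writeb(uint32_t address, uint8_t value) { ram[address] = value; }

/* Writes to a code page drop the cached code before touching memory. */
static void code_writeb(uint32_t address, uint8_t value) {
    if (value != ram_readb(address)) {
        cachedBlocks[address >> PAGE_SHIFT] = 0;  /* stand-in for removeBlockAt() */
        ram_writeb(address, value);
    }
}

static const struct Page ramPage  = { ram_readb, ram_writeb };
static const struct Page codePage = { ram_readb, code_writeb };

/* Start with every page as plain RAM. */
void memInit(void) {
    uint32_t i;
    for (i = 0; i < PAGE_COUNT; i++) {
        pages[i] = &ramPage;
        cachedBlocks[i] = 0;
    }
}

/* Called after a block is decoded at `address`: watch that page. */
void markCode(uint32_t address) {
    uint32_t index = address >> PAGE_SHIFT;
    assert(index < PAGE_COUNT);
    pages[index] = &codePage;
    cachedBlocks[index]++;
}

int hasCachedCode(uint32_t address) {
    return cachedBlocks[address >> PAGE_SHIFT] > 0;
}

/* All guest accesses dispatch through the page's handlers. */
uint8_t readb(uint32_t address) {
    assert(address < sizeof ram);
    return pages[address >> PAGE_SHIFT]->readb(address);
}

void writeb(uint32_t address, uint8_t value) {
    assert(address < sizeof ram);
    pages[address >> PAGE_SHIFT]->writeb(address, value);
}
```

As in code_writeb above, a write that stores the value already in memory leaves the cache alone, which avoids throwing away decoded blocks when a program rewrites code with identical bytes.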