I wanted to learn how to write a CPU emulator, because it seemed like fun! I ended up writing a disassembler too. The code is on GitHub.

I came across this “literate program” on how to write one in C++. I wanted to do it in Python instead: translating the presented C++ into Python seemed more educational than following along directly, and I was in the mood to learn more Python.

Writing this reminded me of the more tedious parts of CPU design, namely heavily redundant code: you need to pull the same bits out of different instructions for the same purpose. For example, if an instruction has a destination register, it’s usually in the same bit position (in this particular ISA). The same goes for other fields, so I commonly repeat code like pc_offset_9 = instruction & 0x1ff, which masks off the low 9 bits of the instruction.
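To make the repetition concrete, here is a minimal sketch of the kind of field extraction involved. It assumes a 16-bit instruction word with the opcode in bits 15..12, a destination register in bits 11..9, and a signed 9-bit PC offset in bits 8..0. That layout, the helper names bits and sign_extend, and the example word are assumptions for illustration, not code from the emulator itself.

```python
def bits(instruction, high, low):
    """Return bits high..low (inclusive) of instruction as an unsigned int."""
    width = high - low + 1
    return (instruction >> low) & ((1 << width) - 1)


def sign_extend(value, width):
    """Interpret an unsigned width-bit value as a two's-complement number."""
    sign_bit = 1 << (width - 1)
    return (value ^ sign_bit) - sign_bit


# Hypothetical instruction word, chosen only to exercise each field.
instruction = 0x2BFE

opcode = bits(instruction, 15, 12)       # top 4 bits
dest_reg = bits(instruction, 11, 9)      # assumed destination register field
raw_offset = instruction & 0x1FF         # same mask as in the text above
pc_offset_9 = sign_extend(raw_offset, 9) # offsets are assumed to be signed
```

Every instruction that uses one of these fields ends up repeating the corresponding line, which is where the redundancy comes from.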

The linked C++ solves this by using a templated function with bit flags (an int whose bits record whether a particular instruction will execute each snippet of reused code). I tried doing something similar with Python decorators, but I ended up just parsing every field for every instruction. That isn’t super efficient, since most instructions only use a few of the fields, but it did result in slightly more concise code.
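A rough sketch of the "parse every field for every instruction" approach might look like the following. The Fields tuple, the decode function, the field positions, and the example handler are all hypothetical and assume the same 16-bit layout as before; the real emulator may be organized differently.

```python
from typing import NamedTuple


class Fields(NamedTuple):
    """Every field any instruction might need, decoded up front."""
    opcode: int
    dest_reg: int
    src_reg_1: int
    pc_offset_9: int


def sign_extend(value, width):
    """Interpret an unsigned width-bit value as a two's-complement number."""
    sign_bit = 1 << (width - 1)
    return (value ^ sign_bit) - sign_bit


def decode(instruction):
    """Decode all fields eagerly, whether or not this opcode uses them."""
    return Fields(
        opcode=(instruction >> 12) & 0xF,
        dest_reg=(instruction >> 9) & 0x7,
        src_reg_1=(instruction >> 6) & 0x7,
        pc_offset_9=sign_extend(instruction & 0x1FF, 9),
    )


def load_effective_address(registers, pc, fields):
    """Hypothetical handler: it reads only the fields it cares about."""
    registers[fields.dest_reg] = (pc + fields.pc_offset_9) & 0xFFFF


registers = [0] * 8
fields = decode(0xE3FF)  # hypothetical instruction word
load_effective_address(registers, 0x3001, fields)
```

The handlers stay short because none of them does its own masking and shifting. The cost is that decode computes fields such as src_reg_1 even for instructions where those bits mean something else.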

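Apparently PyPy is often faster than CPython. CPython doesn’t JIT by default: it compiles source to bytecode and then interprets that bytecode in a VM written in C. PyPy has a tracing JIT that compiles hot loops to machine code, which is why it tends to be faster on long-running, loop-heavy code (even though PyPy’s interpreter is written in RPython, a restricted subset of Python, instead of C).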