Chip8 - A CHIP-8 / SuperCHIP Interpreter

Recently I decided life wasn't complicated enough, so I thought I'd turn my hand to emulation programming. It appears to be a very complex topic to get in to, and resources on the internet are rarely clear cut. After a bit of research though it became apparent that the general recommendation was to not start with an emulator per se, rather to look at the 1977 bytecode language CHIP-8, and write an interpreter for it. An interpreter differs slightly from an emulator, as, in this case, the CHIP-8 bytecode is interpreted on the fly, by an intermediate layer of software, into executable code for that platform. This means that a single CHIP-8 program can be run on multiple pieces of hardware without any changes, assuming that that hardware has an interpreter available for it. This was a common approach used by 8-bit machines of the 80s when running BASIC, and is the foundation of modern languages such as Java and anything which uses the .net/mono framework. There are a good deal of resources on CHIP-8, a language simplified by the fact that it only has 35 opcodes, 45 when extended to SuperCHIP, so I'll not go into detail here, opting instead to link some of the articles I found most useful:

Matthew Mikolay's CHIP-8 breakdown
Cowgod's technical reference
The Cosmac VIP Manual (the original hardware implementing a CHIP-8 interpreter)
Laurence Muller's coding tutorial

The latter is a very useful link when writing your own interpreter, and is the basis for Chip8 (my own implementation). Chip8 differs slightly from the interpreter outlined in the article in that it uses SFML for input and graphics (of course!) and also implements the extended SuperCHIP opcodes. If you are interested in writing your own implementation I highly recommend Laurence's article (as well as the other links for details of all the opcodes), and of course you can check out the source of Chip8 which is released under the zlib license. Here's a video of Chip8 in action:

Rather than talk about Chip8's internals, which are already well documented in the aforementioned articles, I thought I'd write something about the CHIP-8 bytecode itself. When writing Chip8 I wanted a program which would test opcodes for me as I implemented them, so I set about writing a short piece of bytecode, that can be found in the Chip8 source here. The bytecode is initialised directly into an array which can be loaded into the interpreter's virtual memory. The test program can be seen in action at the beginning of the video, which displays a message saying Test OK! along with a logo.

CHIP-8 bytecode varies slightly from most common bytecode formats used in interpreters in that each opcode is 16 bits wide rather than 8. This is because the opcode parameters are encoded in the opcode itself, rather than the following 2 or more bytes as is more common in other languages. CHIP-8 opcodes are big endian, and can be read from memory like so:

uint16_t opcode = (pc[0] << 8) | pc[1];

where pc is the program counter pointing into the interpreter program memory. This is, therefore, also incremented two bytes at a time. A typical opcode, such as JMP which jumps to another address in memory, looks like this:


The upper nibble of the most significant byte, 1, tells us that this is a jump opcode. NNN are 3 values representing the address in memory to which the program counter should jump. So


will jump to address 0x24D. Once the format is understood it is quite easy to write a simple program in CHIP-8 bytecode, especially if you've ever done any programming in an assmbly language (If you haven't and are interested in learning then I highly recommend Human Resource Machine as a great introduction). One thing to note about writing in bytecode directly is that without the convenience of things such as labels, which one might use when writing assember, you are required to manually track the address of every single opcode in memory. This can become a pain, particularly when inserting a new line somewhere, as this requires updating all opcodes, such as jumps, which use an address as a parameter. In other languages removing lines would be easier, as an opcode could be replaced with a NOP (no operation) which would keep the memory addresses aligned, but in CHIP-8 there is no NOP opcode so we're out of luck. This is why in my code you'll notice a comment on every line starting with 0x2XX, the address in memory at which the current line of code starts. In the CHIP-8 interpreter all programs are loaded at 0x200 in memory, so this is the base address. I recommend planning out your program as thoroughly as possible in advance as inserting a line, updating all the comments then seeking out all opcodes which need updating can quickly become tedious. With that in mind let's break down the test program.

The very first opcode is a jump. Looking at the example above we can tell that this jumps to 0x248 in memory. This is because static data used to represent sprites (in this case font sprites) and subroutines are placed at the beginning of the program. The jump simply skips this data and jumps to the beginning of the executable code. The characters in the reduced font set are used to display the test text, drawn as a series of sprites. Sprites in CHIP-8 are drawn in rows, each row of 8 pixels is represented by one byte, with up to 15 bytes representing 15 rows (0-F in hex). If a bit in a row is 1 then a pixel is drawn, else if it is 0 it is not drawn. Rows start at the top of the sprite and are drawn moving down the screen. The characters in this particular case are 4 pixels wide, so the lower nibble is always 0, and are 5 rows high, making each character 5 bytes in size. The comments next to each row of bytes in the source code describe the character which the sprite is meant to represent. After the font set there is an empty byte used to pad the current address to an even value. This is because the CHIP-8 spec requires all opcodes to start on an even value address, although in practice I have found that it doesn't appear to matter.

The first piece of executable code is a subroutine. This is used to create a small delay between drawing each character (as well as test the subroutine opcodes), before drawing the character itself. CHIP-8 includes a timer which ticks down at 60Hz. The subroutine uses this by first copying the value 7 into register V2, followed by moving the value of V2 into the timer counter. The sprite is then drawn, before entering into a loop which reads the value of V2 and checks if it is zero. If it is the program moves on to the next opcode exiting the subroutine. If it is not then the current timer value is read into V2 before jumping back to the zero value check again. This causes the subroutine to loop while the counter decrements. Eventually when the timer has counted down to zero the subroutine will exit.

Following this is the main entry point of the program, at which the jump on the very first line arrives. The code block is quite repetitive, although simple. First the I register is set to the address of the first byte of the character to draw. For example the first byte of the letter C is at 0x202. The opcode documentation tells us that 0xANNN is the opcode which sets the I register. Therefore the next line reads:

0xA2, 0x02,

Then, to decide where the sprite should be drawn on screen, the V0 and V1 registers are set to the X and Y coordinates respectively where 0, 0 is the top left corner of the display. 0x6XNN is the opcode which loads value NN into the register X. 0x2NNN calls a subroutine at address NNN. In this case we call the subroutine detailed above, which starts at 0x23A

0x22, 0x3A,

This pattern is repeated for each of the characters that make up the text 'CHIP 8 Test OK!'.

The next block is a bit more interesting. First a counter is incremented which counts the number of times the test loop has been run. If the value reaches 2 it is reset and jumps to check the state of register V4. This is used to decide whether or not to switch the test program to hi-res mode (0x00FF) or low-res mode (0x00FE). The hi-res mode is actually a SuperCHIP addition, and the opcodes are part of the extended set. This is here in the program to check the opcode implementation, as well as the graphics renderer. When the resolution is switched as is the value of register V4. When the program runs it will now toggle between display modes each time it has finished displaying the test sprites.
    Once the display mode opcodes are tested we find another block of static data. Ideally this should have been placed at the beginning of the code along with other static information but, as this was added later to test the SuperCHIP large sprite implementation, it was easier for me to insert near the end, due to the previously mentioned problem with shifting data addresses when inserting new lines of code. The block of data is, infact, a large sprite, 16 x 16 pixels in size. It works similarly to the CHIP-8 sprites, except that 2 bytes are used per row instead of 1, and the height is fixed at 16 rows, totalling 32 bytes. The opcode used to draw this is the same as the CHIP-8 opcode, except that the height value is set to 0, as large sprites always have 16 rows. The opcode immediately preceeding the sprite data is a jump, used to skip the data (if this were mission critical code the data block would certainly be at the beginning, removing the need for any jump). The sprite, when it is drawn, appears as a circular '8' logo on screen.

Finally the program tests the sound output, before entering a delay loop, similar to that of the delay in the draw subroutine. After the delay the program jumps back to the beginning. To test the sound output CHIP-8 requires only a single opcode, 0xFX18, where X is the V register from which to load the duration value. The CHIP-8 interpreter outputs a single tone all the time the sound timer value is non-zero. The timer counts down at 60Hz, the same as the delay timer, so setting the sound timer to 20 will output a tone for one third of a second.

The test program is rather simplistic, and doesn't test every opcode, but it was fun to write and interesting to learn about. Next time, however, I would probably consider using one of the assembler options provided by sites like Pong Story or, if only because the use of labels would make life easier when inserting lines of code anywhere other than the end. Searching for 'CHIP-8' on Github also reveals many interesting projects, although in varying states of completion.

Hopefully this experience will provide a gateway to future emulation projects, which I will, of course, eventually document in a future post.

Popular Posts