00:00:05.560
Welcome everyone, and thank you for being here. I also want to thank the organizers for letting me speak. My talk today is going to be about making a Game Boy emulator in Ruby, but upon reflection, it's not exactly what this talk is about. It's really about how the Game Boy works and how to create an emulator for it.
00:00:13.320
My name is Colby Swandale, and you can find me on Twitter at @oxby. If you have any questions, feel free to find me in the hallways afterwards or DM me on Twitter; I have DMs open to the public. So yeah, I created a Game Boy emulator, and I'm going to show it to you as the very first thing.
00:00:26.800
Let's give it a shot. This is The Legend of Zelda. It looks like it works, but it actually does not. There's supposed to be a ship and lightning, neither of which shows up. However, it's completely written in Ruby, with one tiny C interface into a library called SDL, which is how I'm rendering the graphics.
00:01:00.239
Actually, there’s the lightning! Okay, it works...maybe. Let's try something else. Let's show Super Mario. I like to think this one works, but as you’ll see in a second, it doesn’t. Mario can jump, but you'll notice that the background barely scrolls; it scrolls incredibly slowly. The Goomba works, though.
00:01:14.720
The one game that does work, which I am pretty happy about, is Tetris. This takes a bit of a while to boot up, but I’ll fill you in on some of the details. I started this project late in 2015, and it took about seven months to complete. A lot of it was just reading about how the games work, implementing some of the components that make it up, and then I eventually just went straight ahead and made the whole thing. I also gave a talk at RubyKaigi last year about this emulator.
00:02:12.360
If you've seen this talk before, this will be slightly different. I've added many new slides and updated a lot of the existing ones, so don’t be afraid to stick around a bit longer. Sorry, is that bad? Okay, sorry. So this is Tetris. It’s still slow, unfortunately. This emulator is really unoptimized, but you can play it if you are patient enough.
00:02:42.760
So that's the emulator. This is the Game Boy here, and... oh, that's supposed to be a GIF. There you go. This is the Game Boy. Actually, no, it's not. This is BMO from a show called Adventure Time. But this is the Game Boy. It was made by Nintendo in Japan in 1989 and went on to sell 118 million units. This figure includes later models, such as the Game Boy Color. The Game Boy had over a thousand games, including Tetris, Super Mario Land, and Pokémon Red and Blue, and it had a 15-hour battery life, which was significant at the time.
00:03:50.360
For anyone unfamiliar with what an emulator is, Wikipedia defines it as a piece of hardware or software, known as a host, that imitates another piece of hardware or software, known as a guest. If you know the terms host and guest, you're probably familiar with virtual machine software such as VirtualBox, VMware, Xen, and KVM. They emulate various systems like RAM, CPU architectures, disks, and the like.
00:04:22.160
I like to think of emulators as Inception. Emulators let you run things like operating systems within operating systems, if you want to go that far. If I had sound right now, I would be playing that boom sound. This talk is going to cover four major components that make up the Game Boy. I may not have enough time to break everything down, but these are the essential components that make up the Game Boy and how it works.
00:04:44.560
I will discuss how they work independently and, more importantly, how they work together to allow you to run and play games like Pokémon Red and Blue. The first component I will talk about is the CPU. It is the most critical component in the Game Boy as it actually reads the program and executes all the instructions. It’s also known as the main integrated circuit. A lot of things make up the CPU, such as registers, instructions, timers, and halts, but today I will focus on only two topics: registers and instructions. The CPU in the Game Boy is the Sharp LR35902.
00:07:38.360
It's a mouthful. It is an 8-bit CPU clocked at 4.19 MHz, and it has a 16-bit address bus, so to address something in memory you have to use two bytes. It is very similar to two other CPUs, the Zilog Z80 and the Intel 8080. So, the first topic I will discuss is registers. Registers in the Game Boy are like small pieces of memory that physically sit inside the CPU. They are very quick, but there is only a very limited number of them. The Game Boy has ten registers: eight 1-byte registers and two 2-byte registers.
00:09:16.680
Some of these are available for general use, while others have specific functions. The A register, for example, is known as an accumulator register. It is a general-purpose register but has special functions as well. Certain operations, such as addition and subtraction, use the A register as one of their operands and store their result back in it. The F register, known as a status bit register, holds the status bits of the CPU, indicating states such as whether the last instruction was a subtraction or resulted in zero, a full carry, or a half carry.
00:11:41.300
The PC register, also known as the program counter register, contains the address of the next instruction to be executed by the CPU. The SP register, or stack pointer register, holds the address of the top item in the stack. To those unfamiliar, a stack is a small region of memory primarily used in the Game Boy to pass arguments to sub-routines and to hold the program counter register when the program jumps into a sub-routine.
00:13:01.760
Each of the 1-byte registers can be paired with another 1-byte register to form a single 2-byte register. These pairings are typically used to reference something in memory. For example, the A register and the F register can combine to form register AF. If I manipulate data in either register A or register F, that will be reflected in register AF. The first thing I started doing when I built this emulator was creating a class to represent our CPU. This class will encapsulate things like registers and instructions.
00:14:50.360
To emulate registers in the CPU, I created an instance variable for each register with the same name. In the instructions, as you will see in a few slides, we use these instance variables to manipulate the data in these emulated registers. The next important aspect of the CPU is the instructions. These are the actual operations the CPU will perform, including arithmetic like addition and subtraction, and rotating bits. The Game Boy has 256 instructions.
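The register setup described above can be sketched roughly as follows. This is a minimal illustration, not the code from the talk: the class layout and the `af` / `af=` helper names are assumptions, and only the AF pairing is shown.

```ruby
# Hypothetical CPU class: one instance variable per register.
class CPU
  attr_accessor :a, :f, :b, :c, :d, :e, :h, :l, :sp, :pc

  def initialize
    @a = @f = @b = @c = @d = @e = @h = @l = 0 # eight 1-byte registers
    @sp = 0 # 2-byte stack pointer
    @pc = 0 # 2-byte program counter
  end

  # AF is a view over A (high byte) and F (low byte), so a change to
  # either half shows up in the pair.
  def af
    (@a << 8) | @f
  end

  # Writing the pair splits the value back into the two 1-byte halves.
  def af=(value)
    @a = (value >> 8) & 0xFF
    @f = value & 0xFF
  end
end

cpu = CPU.new
cpu.a = 0x12
cpu.f = 0xB0
puts cpu.af.to_s(16) # prints "12b0"
```

Computing the pair on demand, rather than storing it separately, keeps the two views from drifting apart.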
00:16:57.520
This is a small subset, as many of the instructions are slight variations of the ones you see on screen that use different registers. Instructions include loading data between registers, performing mathematical operations, and executing bitwise operations such as AND, XOR, and OR. They also include incrementing and decrementing values in registers, managing stack operations, loading from memory into registers or vice versa, enabling interrupts, and halting the CPU.
00:18:47.760
These sets of instructions combine to form what is known as an opcode table. This table is embedded in the CPU and maps the instruction mnemonic, like LD A, B, into an actual opcode. Each instruction has a unique opcode, and when I want the CPU to execute an instruction, I supply the opcode instead of the mnemonic. For example, enabling interrupts has an opcode of hexadecimal FB.
00:20:52.619
When we compile a program that uses this instruction, it gets encoded into binary format to be read by the CPU, which then executes that instruction. To emulate the opcode table, I use an array. Each item in the array is the mnemonic of an instruction, and its position in the array is the value of that opcode. For instance, the opcode for NOP (no operation) is 0, making it the first item in the array.
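An array-backed opcode table of the kind described above might look like this. It is a sketch only: the constant name and the symbol names are assumptions, and just the first five opcodes are filled in.

```ruby
# Hypothetical opcode table: the array index is the opcode and the value is
# the name of the method that would implement that instruction.
OPCODES = [
  :nop,      # 0x00 NOP
  :ld_bc_nn, # 0x01 LD BC, d16
  :ld_bc_a,  # 0x02 LD (BC), A
  :inc_bc,   # 0x03 INC BC
  :inc_b     # 0x04 INC B
].freeze

puts OPCODES[0x00] # prints "nop"
puts OPCODES[0x04] # prints "inc_b"
```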
00:22:38.560
Here are some example instructions. The first is LD B, C, which loads the value of the C register into the B register. This is pretty simple, nothing too complex, just assigning one variable to another. Here’s another instruction, INC B, where we are incrementing the value of the B register by one. Notice on the second line that we perform a bitwise AND operation on the result of that increment. This is because Ruby does not have an 8-bit data type like languages such as C.
00:24:17.880
The absence of an 8-bit data type means that values can go up to very large numbers. However, the Game Boy has an 8-bit CPU, meaning it can only handle values up to 255. When you go past 255, the value has to wrap back around to zero, which is the natural behavior of 8-bit systems. If you look at this code on GitHub, you will see masks like & 0xFF, which is hexadecimal FF, scattered throughout the code.
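The two instructions and the 8-bit wrap-around described above could be written along these lines. The method names are assumptions, and flag updates in the F register are left out to keep the sketch short.

```ruby
# Trimmed-down, hypothetical CPU holding only the registers used here.
class CPU
  attr_accessor :b, :c

  def initialize
    @b = 0
    @c = 0
  end

  # LD B, C: copy the value of C into B.
  def ld_b_c
    @b = @c
  end

  # INC B: add one, then mask to 8 bits so that 255 + 1 wraps to 0.
  def inc_b
    @b += 1
    @b &= 0xFF
  end
end

cpu = CPU.new
cpu.c = 255
cpu.ld_b_c
cpu.inc_b
puts cpu.b # prints 0
```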
00:26:31.640
Once we have registers and instructions ready to execute in our program, we need a method to execute more than a single instruction. Computers employ a process called Fetch and Execute. Modern computers still work in a similar way, although the process is much more complex. The Game Boy reads an instruction from memory using the program counter register, interprets that instruction, and executes it in a continuous loop.
00:28:16.240
This process continues from when it is first powered on until it eventually turns off. There are exceptions to this process, like the HALT and STOP instructions, which will stop the execution temporarily. Implementing this process is straightforward. We read the instruction from memory using the memory management unit (MMU) variable, which controls memory. We read a single byte using the program counter register.
00:30:34.280
Then, we increment the program counter register by one to move through the code. The program counter register tells the CPU which byte to read next. Once we have the opcode, we look up the mnemonic of that instruction and execute it, completing the process for each instruction that runs through the CPU. Lastly, each instruction takes a particular amount of time to complete.
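One step of the fetch and execute process described above might be sketched like this. The class layout, the `step` method name, and the two-entry opcode table are assumptions; a plain array stands in for the memory management unit.

```ruby
class CPU
  # Sparse stand-in for the opcode table (index = opcode).
  OPCODES = []
  OPCODES[0x00] = :nop
  OPCODES[0x04] = :inc_b

  attr_reader :b, :pc

  def initialize(memory)
    @memory = memory # anything that responds to [] with an address
    @b = 0
    @pc = 0
  end

  # One fetch, decode, execute step.
  def step
    opcode = @memory[@pc]       # fetch the byte the program counter points at
    @pc = (@pc + 1) & 0xFFFF    # advance to the next byte
    mnemonic = OPCODES[opcode]  # decode: look up the mnemonic
    raise "unimplemented opcode 0x#{opcode.to_s(16)}" if mnemonic.nil?

    send(mnemonic)              # execute
  end

  private

  def nop; end

  def inc_b
    @b = (@b + 1) & 0xFF
  end
end

cpu = CPU.new([0x00, 0x04, 0x04]) # NOP, INC B, INC B
3.times { cpu.step }
puts cpu.b  # prints 2
puts cpu.pc # prints 3
```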
00:32:11.440
Instructions like NOP take what's called cycles to complete. For example, NOP takes four cycles. More complex instructions, like calling a subroutine, can take 16 cycles. Other components, such as the Picture Processing Unit and other hardware, utilize these cycles as well. To implement this timing, I created another array that defines the timing for each instruction.
00:34:12.720
Using the opcode as an index, I can look up how many cycles each instruction should take, assigning that to an instance variable available to all other components in the Game Boy.
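The timing lookup described above can be sketched as a second array indexed by opcode. The constant and method names are assumptions, and only the first five opcodes are listed.

```ruby
# Cycle counts indexed by opcode:
# NOP, LD BC,d16, LD (BC),A, INC BC, INC B
CYCLES = [4, 12, 8, 8, 4].freeze

class CPU
  attr_reader :cycles # cycles taken by the most recent instruction

  def initialize
    @cycles = 0
  end

  # Hypothetical helper: store the timing so other components can read it.
  def record_timing(opcode)
    @cycles = CYCLES.fetch(opcode)
  end
end

cpu = CPU.new
cpu.record_timing(0x00)
puts cpu.cycles # prints 4
```

Using `fetch` rather than `[]` makes a missing table entry fail loudly instead of silently yielding `nil`.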
00:36:10.319
The next component is memory. Memory acts as long-term storage in the Game Boy. It is where the program and video memory reside among other essential data. Memory is controlled by what's known as a memory management unit (MMU). In the Game Boy, rather than having a stick of RAM as you see in modern devices, memory is spread across various chips managed by different components.
00:38:05.520
The memory management unit serves as the interface to the memory architecture, allowing access to memory without worrying about which chip stores the data. The Game Boy has a 64-kilobyte address space, which is 65,536 unique addresses, spread across multiple chips. Memory is divided into different regions catering to various needs. The first half is where the game program lives; then there's video RAM for rendering and general RAM that the program can use freely without special purposes.
00:40:56.920
At the end, there are reserved bits for I/O hardware RAM, and the sprites live there as well. To emulate memory, we will create a class to represent the memory management unit, holding all the memory inside itself. While I may not use all the available memory in this project, we define it as a full amount as it's easier than worrying about specifics.
00:43:33.080
We implement the reading functionality for memory, taking into account the different regions described earlier. For instance, if reading from hexadecimal 0x0000 up to 0x8000, that data comes from the game cartridge. I haven't implemented this fully yet, but I'll cover that shortly. Reading between 0x8000 and 0xFFFF, which is where video memory and the other RAM regions live, reads from an array we created. Writing to RAM works similarly, assigning values rather than reading them.
00:45:59.240
Finally, we create a global variable for the memory management unit. Passing memory around in the Game Boy without pointers or references is quite challenging, so it’s more straightforward to define it as a global variable to track the state of RAM.
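A memory management unit along the lines described above might look like the following. This is a sketch under assumptions: the class name, the constructor argument, and the choice to ignore writes below 0x8000 are illustrative, and a plain array stands in for the cartridge.

```ruby
class MMU
  def initialize(cartridge)
    @cartridge = cartridge          # responds to [] with an address
    @memory = Array.new(0x10000, 0) # the full 64 KB address space
  end

  # Read a byte, picking the source by region.
  def [](address)
    if address < 0x8000
      @cartridge[address] # first half: game program on the cartridge
    else
      @memory[address]    # video RAM, general RAM, I/O, sprites
    end
  end

  # Write a byte. Writes to the cartridge region are ignored in this sketch.
  def []=(address, value)
    @memory[address] = value & 0xFF unless address < 0x8000
  end
end

# Global instance, as in the talk, so every component sees the same state.
$mmu = MMU.new([0x00, 0xC3, 0x50, 0x01])
$mmu[0x8000] = 0x1FF
puts $mmu[0x8000] # prints 255
puts $mmu[0x0001] # prints 195
```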
00:48:09.440
Similar to the CPU, the memory has registers, but they function differently. You will notice that there are no instructions in the CPU to interact directly with hardware. Instead, all hardware communicates through memory. You have memory registers, which are specific addresses in memory assigned particular purposes to interact with specific hardware components. The memory registers live at the very end of the memory space.
00:50:03.320
There are only around 50 addresses reserved. For example, the LCDC register controls the LCD screen’s options on the Game Boy, like turning the screen on or off, enabling the background, and enabling sprites on the display. Another example is the timer for internal clock management.
00:52:11.760
The SCX and SCY registers control the viewport, allowing you to manage which part of the background it displays. The program will write to a specific address in memory, while the respective hardware component continuously reads that address. For example, to make a sound in the game, you would write to a register called Noise Register 10, which resides at a specific address, and the sound hardware would keep reading that address over and over again.
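The write-then-read pattern behind memory registers can be illustrated with the two scroll registers. SCY and SCX live at 0xFF42 and 0xFF43; the plain array standing in for memory and the lambda standing in for the hardware are assumptions for this sketch.

```ruby
SCY = 0xFF42 # vertical scroll register address
SCX = 0xFF43 # horizontal scroll register address

memory = Array.new(0x10000, 0)

# The game writes the scroll position to memory...
memory[SCX] = 16
memory[SCY] = 8

# ...and the hardware side (a stand-in lambda here) reads it when it draws.
viewport_origin = -> { [memory[SCX], memory[SCY]] }
p viewport_origin.call # prints [16, 8]
```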
00:54:11.400
Next, we have the Picture Processing Unit (PPU). The PPU chip functions like today's graphics processing units. It renders an area of 256 pixels by 256 pixels, but the screen will only display 160 by 144 pixels. It can only display four shades of gray, which in the Game Boy's case translates to various shades of green, with a refresh rate of 60 Hz (60 frames per second). The PPU has only 8 kilobytes of memory, which seems minuscule.
00:56:39.960
What makes it interesting is that the PPU emulates the behavior of a CRT screen. Instead of drawing the entire frame 60 times per second, it draws each individual horizontal line on the screen in succession and, upon finishing, restarts that process 60 times per second. The job of the PPU is to take data from video memory.
00:58:24.480
Specifically, it reads what's called the Sprite Table or the Object Attribute Memory (OAM), which will be explained later, and interprets that data into a signal that the screen can display. The Game Boy draws three different types of objects on the screen, each with different behaviors: backgrounds (which typically draw the world), objects (or sprites), and windows, which act like fixed backgrounds for menus.
01:00:49.080
Sprites are typically the interactable objects in a game, like characters and items, and move independently of the background. The OAM is used to place the sprite in a specific area and move it around the display. The PPU has no instructions like those in the CPU; instead, it operates as a self-contained unit, requiring minimal interaction. When the PPU is drawing the lines on the display, it moves through four distinct operation modes. The first is the OAM read mode, where it reads the OAM and calculates which sprites to render.
01:02:34.520
During this time, you can't update the video memory because the PPU is busy reading data and processing it. The next is the video RAM read mode, where it reads things like the background and windows, during which you still cannot update the video memory. Once a line has been drawn, it enters the horizontal blank mode, during which you are free to update video memory or sprites. Finally, after rendering all lines, the vertical blank mode is reached, which allows the PPU to prepare for rendering the first line again.
01:04:29.520
Each mode takes a particular amount of time to complete, with the OAM read taking 80 cycles, and the vertical blank mode taking 4560 cycles to finish. I won’t go too deep into the PPU, as it's somewhat complex, but I will show how it maintains the cycle count just like the CPU. It accumulates the cycles from the CPU and determines which mode it should be in, moving to the next mode when it exceeds the timing for the current one.
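The mode cycling described above can be sketched as a small state machine. The 80 and 4560 cycle figures come from the talk; the 172 and 204 figures for the other two modes are commonly cited approximations and are assumptions here, as are the class and method names.

```ruby
class PPU
  OAM_READ_CYCLES = 80    # from the talk
  VRAM_READ_CYCLES = 172  # assumption: commonly cited approximation
  HBLANK_CYCLES = 204     # assumption: commonly cited approximation
  VBLANK_CYCLES = 4560    # from the talk
  VISIBLE_LINES = 144

  attr_reader :mode, :line

  def initialize
    @mode = :oam_read
    @line = 0
    @clock = 0
  end

  # Feed in the cycles the CPU just spent and advance modes as needed.
  def step(cycles)
    @clock += cycles
    loop do
      budget = budget_for(@mode)
      break if @clock < budget

      @clock -= budget
      advance
    end
  end

  private

  def budget_for(mode)
    { oam_read: OAM_READ_CYCLES, vram_read: VRAM_READ_CYCLES,
      hblank: HBLANK_CYCLES, vblank: VBLANK_CYCLES }.fetch(mode)
  end

  def advance
    case @mode
    when :oam_read then @mode = :vram_read
    when :vram_read then @mode = :hblank # a real PPU would draw the line here
    when :hblank
      @line += 1
      @mode = @line == VISIBLE_LINES ? :vblank : :oam_read
    when :vblank
      @line = 0
      @mode = :oam_read
    end
  end
end

ppu = PPU.new
ppu.step(80)
puts ppu.mode # prints "vram_read"
ppu.step(172 + 204)
puts ppu.line # prints 1
```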
01:06:16.080
As mentioned before, the PPU renders an area of 256 by 256 pixels, yet the screen only displays a 160 x 144 area. Here’s how the illusion of scrolling is created: the viewport is the visible area on the screen, which can move while the game world updates just behind it. The Game Boy employs a tile system instead of holding a frame buffer. You’ll notice many tiles in a Game Boy game repeat.
01:08:29.280
A tile is an 8 by 8 set of pixels, and the Game Boy has a data structure for reusable pixels called tiles. It can render up to 32 x 32 tiles, but only displays a 20 x 18 grid of tiles onscreen. The tiles can be reused for various parts of the game world, saving memory. Each tile takes up 16 bytes, with each pixel taking two bits of color data, allowing for four colors: white, light gray, dark gray, and black.
01:10:37.120
The pixel data is binary: white is 00, light gray is 01, dark gray is 10, and black is 11. This tile system allows for efficient use of memory, as you can see the tiles moving on and off the screen, being read from and written to memory as needed via direct memory access, which is a Game Boy feature that allows large amounts of video data to be transferred at once.
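Decoding a 16-byte tile into its 64 two-bit pixels might look like this. In the Game Boy's tile format each row is two bytes, the first holding the low bit of every pixel and the second the high bit, with bit 7 as the leftmost pixel. The function name is an assumption.

```ruby
# Decode one 16-byte tile into 8 rows of 8 colour values (0..3).
def decode_tile(bytes)
  bytes.each_slice(2).map do |low, high|
    7.downto(0).map do |bit|
      (((high >> bit) & 1) << 1) | ((low >> bit) & 1)
    end
  end
end

# One interesting row followed by seven blank rows.
tile = [0x3C, 0x7E] + [0] * 14
rows = decode_tile(tile)
p rows.first # prints [0, 2, 3, 3, 3, 3, 2, 0]
```

On real hardware these values pass through a palette register before becoming a shade, so the fixed mapping from value to shade is a simplification.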
01:12:44.480
When you have all these tiles loaded into memory, how do you display them on the screen? The Game Boy has a map. This map references each tile and determines where they appear on the display, used primarily for backgrounds and windows. Assigning tile references instead of actual pixel data saves significant memory.
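The lookup from a screen pixel to a tile reference in the map can be sketched as follows. The function name and keyword arguments are assumptions; the scroll offsets play the role of the SCX and SCY registers, and the background wraps around at its 256-pixel edge.

```ruby
MAP_SIZE = 32  # tiles per row and per column of the background map
TILE_SIZE = 8  # pixels per tile edge

# Returns the tile reference covering screen pixel (x, y) for the given
# scroll offsets.
def tile_index_at(tile_map, x, y, scx: 0, scy: 0)
  world_x = (x + scx) % (MAP_SIZE * TILE_SIZE)
  world_y = (y + scy) % (MAP_SIZE * TILE_SIZE)
  tile_map[(world_y / TILE_SIZE) * MAP_SIZE + (world_x / TILE_SIZE)]
end

tile_map = Array.new(MAP_SIZE * MAP_SIZE, 0)
tile_map[1] = 7 # tile number 7 sits in column 1, row 0
puts tile_index_at(tile_map, 8, 0)         # prints 7
puts tile_index_at(tile_map, 0, 0, scx: 8) # prints 7
```

The map holds one byte per cell, so a full 32 x 32 background costs 1,024 bytes instead of the 16,384 bytes that raw two-bit pixel data would need.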
01:15:09.760
Only a small portion of actual tile data is loaded into memory, as references are used extensively. Windows function similarly, sharing the same tile data as backgrounds. Lastly, we have the Object Attribute Memory (OAM), which holds tables that tell the Game Boy where and how to display sprites. The OAM also lives in the same area as the hardware and I/O memory, containing attributes like X and Y positions, priority, and whether to flip.
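A single OAM entry could be decoded roughly like this. Each entry is four bytes (Y position, X position, tile number, flags); on the hardware the stored positions are offset by 16 and 8 pixels so that sprites can sit partly off screen. The struct and function names are assumptions, and the palette bit is left out.

```ruby
Sprite = Struct.new(:x, :y, :tile, :behind_background, :flip_y, :flip_x)

# Decode one 4-byte OAM entry into screen coordinates and attributes.
def decode_sprite(bytes)
  y, x, tile, flags = bytes
  Sprite.new(x - 8, y - 16, tile,
             flags[7] == 1, # priority: drawn behind the background
             flags[6] == 1, # flipped vertically
             flags[5] == 1) # flipped horizontally
end

sprite = decode_sprite([16, 8, 0x42, 0b0010_0000])
puts sprite.x      # prints 0
puts sprite.y      # prints 0
puts sprite.flip_x # prints true
```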
01:17:53.080
Lastly, we discuss cartridges, which are designed to hold the game's data. You buy them at retail stores and plug them into the back of the Game Boy. There are 29 different cartridge types, which can hold up to 2 megabytes of data. Large games like Pokémon can take up two megabytes or more and have 32 kilobytes of external storage.
01:19:32.800
These cartridges may support external hardware like real-time clocks and additional storage. They utilize a memory bank controller, roughly analogous to the memory management unit. The Game Boy program lives in reserved memory space, and while the Game Boy only has 64 kilobytes of address space, larger games use paging techniques to map sections of their content from the cartridge into that space.
01:21:51.720
Cartridges use a system known as memory banks, where the game's data is chunked into 16 kilobyte sections with each assigned a unique index. The first chunk is mapped to bank zero, with subsequent chunks swapping into readable regions as needed. This way, the Game Boy can read 2 megabytes of game data by loading specific regions on demand. Using the appropriate memory bank controller is crucial.
01:23:40.840
I also created a cartridge class that implements the square-bracket operator on top of the memory bank system. The type of cartridge is declared in a specific memory region of the cartridge itself. Different implementations are developed for each type of cartridge, and each takes charge of its own banking.
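A cartridge class with bank switching behind the square-bracket operator might be sketched as below. The class name and the `select_bank` method are assumptions; real memory bank controllers switch banks in response to writes to certain addresses, which is simplified to a method call here.

```ruby
class BankedCartridge
  BANK_SIZE = 0x4000 # 16 KB per bank

  def initialize(rom)
    @rom = rom
    @bank = 1 # bank currently visible at 0x4000..0x7FFF
  end

  # Hypothetical helper standing in for the controller's bank-select write.
  def select_bank(index)
    @bank = index
  end

  def [](address)
    if address < BANK_SIZE
      @rom[address]                                   # bank 0, always mapped
    else
      @rom[@bank * BANK_SIZE + (address - BANK_SIZE)] # switchable bank
    end
  end
end

# A fake 4-bank ROM in which every byte holds its own bank number.
rom = Array.new(4 * 0x4000) { |i| i / 0x4000 }
cart = BankedCartridge.new(rom)
puts cart[0x4000] # prints 1
cart.select_bank(3)
puts cart[0x4000] # prints 3
puts cart[0x0000] # prints 0
```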
01:25:59.820
Now, I need to be conscious of the remaining time. We manipulate and read memory through the memory controller after defining the correct region. Overall, we encapsulate all the components into an emulator class that includes parts like the CPU, the picture processing unit, and the screen. The screen functions as a simple C interface connecting to the SDL library.
01:28:10.680
It takes the frame buffer information from the PPU and the memory management unit and produces visual output. We define a method called run that loops continuously, stepping the CPU and the PPU and rendering frames to the screen when permitted. The emulator is instantiated by loading the ROM, and then we just run it. I didn’t get into L management, input controls, and sound handling.
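The top-level loop described above could be sketched like this. Everything here is an assumption made for illustration: the keyword arguments, the convention that the PPU's `step` returns true when a frame is complete, and the three stub classes that stand in for the real CPU, PPU, and SDL-backed screen.

```ruby
class Emulator
  def initialize(cpu:, ppu:, screen:)
    @cpu = cpu
    @ppu = ppu
    @screen = screen
  end

  # Runs until the given number of frames has been drawn (nil = forever).
  def run(frames: nil)
    drawn = 0
    until frames && drawn >= frames
      cycles = @cpu.step   # execute one instruction, get its cycle count
      next unless @ppu.step(cycles) # true once a full frame is ready

      @screen.render(@ppu.framebuffer)
      drawn += 1
    end
  end
end

# Stand-ins so the sketch runs on its own.
class StubCPU
  def step
    4 # pretend every instruction is a 4-cycle NOP
  end
end

class StubPPU
  CYCLES_PER_FRAME = 70_224 # 154 lines x 456 cycles
  attr_reader :framebuffer

  def initialize
    @clock = 0
    @framebuffer = Array.new(160 * 144, 0)
  end

  def step(cycles)
    @clock += cycles
    return false if @clock < CYCLES_PER_FRAME

    @clock -= CYCLES_PER_FRAME
    true
  end
end

class StubScreen
  attr_reader :frames

  def initialize
    @frames = 0
  end

  def render(_framebuffer)
    @frames += 1
  end
end

screen = StubScreen.new
Emulator.new(cpu: StubCPU.new, ppu: StubPPU.new, screen: screen).run(frames: 2)
puts screen.frames # prints 2
```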
01:30:17.880
A fantastic talk worth checking out for deeper insight is 'The Ultimate Game Boy Talk', which lasts an hour and thoroughly breaks down everything about the Game Boy. If you are interested in this project and wish to see the entire implementation of the emulator, it’s available on GitHub under Swatf. Thank you very much for your time.