bandarra.me

Writing Doom Fire for the Raspberry Pi Pico and the Pimoroni Pico Display

Doom Fire running on a Pi Pico

Introduction

The Doom Fire animation is the fire animation used in the PSX port of the original Doom game. It makes a nice Hello World when learning a new graphics API, and I recently wrote about a modern JavaScript implementation.

The Raspberry Pi Pico is a new board based on the new RP2040 microcontroller. Together with the Pimoroni Pico Display, it makes an interesting platform to port the Doom Fire animation to.

Using MicroPython

MicroPython is an implementation of Python 3 that is optimised to run on microcontrollers.

The nice thing about MicroPython is how beginner-friendly it is: it only requires flashing a custom image and installing the Thonny IDE. Getting started has been covered extensively by the official documentation, blog posts, and YouTube videos, so I won't repeat those details here. I do, however, wonder why the official documentation is only available as a PDF file and not as an HTML page.

Pimoroni has also done a great job: it provides a custom firmware that makes it a breeze to use the Pico Display from MicroPython, along with a set of examples for the display.

If you are interested in the final MicroPython implementation, check out the source code on GitHub.

In my first attempt, I created separate methods for updating the fire, update(), and for rendering the outcome, render():

def update(self):
  # Spread the fire upwards: each cell in rows 1..height-1 feeds a cell in the row above.
  for y in range(1, height):
    row = y * width
    next_row = (y - 1) * width
    for x in range(0, width):
      color = self.fire[row + x]

      new_x = x
      if color > 0:
        rand = random.randint(0, 3)
        # Cool down by 0 or 1 and drift sideways by -1..2 columns.
        color = color - (rand & 1)
        new_x = new_x + rand - 1
      self.fire[next_row + new_x] = color

def render(self):
  for y in range(0, height):
    row = y * width
    for x in range(0, width):
      # Map the fire intensity to a pen colour and draw the pixel.
      color = self.fire[row + x]
      pen = colorScale[color]
      display.set_pen(pen)
      display.pixel(x, y)
  # Push the finished frame to the screen.
  display.update()

This implementation works and was quick to implement, even with almost no experience with Python programming. The problem is that this implementation takes almost 4 seconds to render each frame. Yes, that's 0.25 frames per second (FPS).

The most obvious optimisation is to avoid looping over the fire pixels twice by updating and rendering in the same pass, merging render() into update():

def update(self):
    for y in range(0, height):
        row = y * width
        next_row = (y - 1) * width
        for x in range(0, width):
            color = self.fire[row + x]
            pen = colorScale[color]

            if y > 0:
                new_x = x
                if color > 0:
                    rand = random.randint(0, 3)
                    color = color - (rand & 1)
                    new_x = new_x + rand - 1
                self.fire[next_row + new_x] = color

            display.set_pen(pen)
            display.pixel(x, y)
    display.update()

This cut the time to render to 2 seconds. That's a great improvement, but not nearly enough to run at the 27 FPS required by the Doom Fire animation.

At this point, I found it unlikely that further work on the Python implementation would be worthwhile, but I also found it unlikely that the Pico couldn't run the animation fast enough. My guess was that MicroPython had a larger overhead than I expected.

Using C++

While the C++ process is also well documented (also as a PDF), I can't say it's as easy as getting started with MicroPython: it requires installing a toolchain with a small set of tools. The documentation also covers setting up different IDEs. In my case, I used CLion.

Rewriting the latest Python code in C++ looks like the following:

void update(uint32_t time) {
    for (int y = 0; y < pimoroni::PicoDisplay::HEIGHT; y++) {
        int row = y * pimoroni::PicoDisplay::WIDTH;
        int next_row = y == 0 ? 0 : (y - 1) * pimoroni::PicoDisplay::WIDTH;

        for (int x = 0; x < pimoroni::PicoDisplay::WIDTH; x++) {
            uint8_t color = fire[row + x];
            uint16_t pen = pallete[color];
            pico_display.setPen(pen);
            pico_display.setPixel(x, y);

            if (y > 0) {
                int new_x = x;
                int rand = std::rand() % 3;
                new_x = (new_x + rand - 1);
                color = color > 0 ? color - (rand & 1) : 0;
                fire[next_row + new_x] = color;
            }
        }
    }
    pico_display.update();
}

From the start, this code ran at ~20 FPS, or around 50 ms per frame, which is a huge improvement over MicroPython but still short of our 27 FPS target.
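The frame times and FPS figures quoted in this post are two views of the same number. As a quick sanity check, here is a minimal sketch that converts between them; the helper names fps_from_ms and ms_from_fps are made up for illustration and are not part of the Pico SDK or the Pimoroni libraries.

```cpp
// Hypothetical helpers, not part of the Pico SDK or Pimoroni libraries.

// Frames per second for a given frame time in milliseconds.
constexpr double fps_from_ms(double frame_ms) {
    return 1000.0 / frame_ms;
}

// Time budget per frame, in milliseconds, for a target frame rate.
constexpr double ms_from_fps(double fps) {
    return 1000.0 / fps;
}
```

With these, a 50 ms frame works out to 20 FPS, and a 27 FPS target leaves a budget of roughly 37 ms per frame.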

Since we don't need a high-quality random number generator, a faster one seemed likely to help. A quick Google search took me to this StackOverflow answer, which promises to be twice as fast as std::rand():

void update(uint32_t time) {
    for (int y = 0; y < pimoroni::PicoDisplay::HEIGHT; y++) {
        int row = y * pimoroni::PicoDisplay::WIDTH;
        int next_row = y == 0 ? 0 : (y - 1) * pimoroni::PicoDisplay::WIDTH;

        for (int x = 0; x < pimoroni::PicoDisplay::WIDTH; x++) {
            uint8_t color = fire[row + x];
            uint16_t pen = pallete[color];
            pico_display.setPen(pen);
            pico_display.setPixel(x, y);

            if (y > 0) {
                int new_x = x;
                int rand = fast_rand() % 3;
                new_x = (new_x + rand - 1);
                color = color > 0 ? color - (rand & 1) : 0;
                fire[next_row + new_x] = color;
            }
        }
    }
    pico_display.update();
}

And, indeed, it improved rendering to about 37 ms per frame, exactly the 27 FPS we needed.
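The code above calls fast_rand() without showing its body, and the StackOverflow answer isn't reproduced in this post. What follows is only a sketch of what such a generator could look like, assuming a simple linear congruential generator (LCG). The constants, the g_seed variable, and the fast_srand() helper are assumptions for illustration, not necessarily what the final code uses.

```cpp
#include <cstdint>

// Assumed generator state; the name g_seed is illustrative.
static uint32_t g_seed = 1;

// Seeds the generator (hypothetical helper).
inline void fast_srand(uint32_t seed) {
    g_seed = seed;
}

// One LCG step: multiply, add, then keep 15 of the upper bits.
// Returns a value in the range [0, 32767]. Low quality, but cheap:
// one multiplication, one addition, a shift and a mask.
inline int fast_rand() {
    g_seed = 214013u * g_seed + 2531011u;
    return static_cast<int>((g_seed >> 16) & 0x7FFF);
}
```

Because the result is never negative, fast_rand() % 3 always lands in the 0 to 2 range that the fire update expects.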

Adding Wind

The random number generated is an integer between 0 and 2 (inclusive) that controls how the fire in a given cell spreads: 0 moves it one column to the left, 1 keeps it in the same column, and 2 moves it one column to the right.

Adding wind means that we want to add a bias to this number. If a negative bias is added, the fire will spread more to the left and if a positive bias is added, the fire will spread more to the right.

To control the wind, we are going to use the B button to add wind to the left and the Y button to add wind to the right.

Checking if a button is pressed on the Pico Display can be done with a call to pico_display.is_pressed():

if (pico_display.is_pressed(pimoroni::PicoDisplay::X)) {
  // Add button handler code here.
}

The problem with this method is that, since we run this every frame, the wind will increase very quickly, even when pressing the button for a short period of time.

Instead, what we want is to increase or decrease the wind only when the button gets pressed; more precisely, when it changes state from "not pressed" to "pressed":

bool y_pressed = false;
bool b_pressed = false;

while (true) {
  if (!y_pressed && pico_display.is_pressed(pimoroni::PicoDisplay::Y)) {
    // Button Y changed state from "not pressed" to "pressed".
    wind++;
  }
  y_pressed = pico_display.is_pressed(pimoroni::PicoDisplay::Y);

  if (!b_pressed && pico_display.is_pressed(pimoroni::PicoDisplay::B)) {
    // Button B changed state from "not pressed" to "pressed".
    wind--;
  }
  b_pressed = pico_display.is_pressed(pimoroni::PicoDisplay::B);
}
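The "not pressed" to "pressed" check above is repeated for each button, so it can be factored into a small helper. This is a minimal sketch; the ButtonEdge type is hypothetical and not part of the Pimoroni library.

```cpp
// Hypothetical helper: reports a press only on the frame where the
// button goes from "not pressed" to "pressed".
struct ButtonEdge {
    bool was_pressed = false;

    // Call once per frame with the current button state.
    // Returns true only on a rising edge.
    bool rising(bool is_pressed_now) {
        bool edge = is_pressed_now && !was_pressed;
        was_pressed = is_pressed_now;
        return edge;
    }
};
```

With one ButtonEdge per button, the loop body would read something like `if (y_button.rising(pico_display.is_pressed(pimoroni::PicoDisplay::Y))) wind++;`, which also reads the button only once per frame.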

We can then apply wind to our logic:

void update(uint32_t time) {
    for (int y = 0; y < pimoroni::PicoDisplay::HEIGHT; y++) {
        int row = y * pimoroni::PicoDisplay::WIDTH;
        int next_row = y == 0 ? 0 : (y - 1) * pimoroni::PicoDisplay::WIDTH;

        for (int x = 0; x < pimoroni::PicoDisplay::WIDTH; x++) {
            uint8_t color = fire[row + x];
            uint16_t pen = pallete[color];
            pico_display.setPen(pen);
            pico_display.setPixel(x, y);

            if (y > 0) {
                int new_x = x;
                int rand = fast_rand() % 3;
                new_x = (new_x + rand - 1 + wind);
                if (new_x >= pimoroni::PicoDisplay::WIDTH) {
                    new_x = new_x - pimoroni::PicoDisplay::WIDTH;
                } else if (new_x < 0) {
                    new_x = new_x + pimoroni::PicoDisplay::WIDTH;
                }
                color = color > 0 ? color - (rand & 1) : 0;
                fire[next_row + new_x] = color;
            }
        }
    }
    pico_display.update();
}

Another modification is that we now "wrap around" the fire spread: If a pixel at the first column spreads to the left, we teleport that pixel to the last column and if a pixel at the last column spreads to the right, we teleport that to the first column.
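The single addition or subtraction in the code above is enough as long as the wind stays smaller than the display width. If wind is allowed to grow without limit, a column can overshoot by more than one width and still end up out of range. A more general wrap could look like the sketch below; wrap_column is a hypothetical helper that is not in the original code, and the modulo may cost more than the comparisons, so it trades some speed for robustness.

```cpp
// Hypothetical helper: maps any column index, including negative ones
// and ones far past the right edge, back into the range [0, width).
constexpr int wrap_column(int x, int width) {
    // In C++, % can return a negative result for a negative x,
    // so shift it back into range before the final reduction.
    return ((x % width) + width) % width;
}
```

An alternative that keeps the cheap comparisons is to clamp wind to a small range when the buttons are handled.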

More perf improvements

These extra checks mean that our FPS took a hit again, and we're now back to 21 FPS. The next improvement is a trick around the pico_graphics API.

When setPixel(x, y) is called, the API will check boundaries to ensure that the values are not written outside the frame_buffer boundaries. In our case, and after implementing the "wrap around" for the wind, we know we will never write outside the boundaries.

So, instead of calling setPixel(x, y), we invoke the ptr(x, y) function, which allows manipulating the framebuffer directly, skipping the boundary validation:

void update(uint32_t time) {
    for (int y = 0; y < pimoroni::PicoDisplay::HEIGHT; y++) {
        int row = y * pimoroni::PicoDisplay::WIDTH;
        int next_row = y == 0 ? 0 : (y - 1) * pimoroni::PicoDisplay::WIDTH;
        for (int x = 0; x < pimoroni::PicoDisplay::WIDTH; x++) {
            uint8_t color = fire[row + x];
            uint16_t pen = pallete[color];
            *pico_display.ptr(x, y) = pen;

            if (y > 0) {
                int new_x = x;
                int rand = fast_rand() % 3;
                new_x = (new_x + rand - 1 + wind);
                if (new_x >= pimoroni::PicoDisplay::WIDTH) {
                    new_x = new_x - pimoroni::PicoDisplay::WIDTH;
                } else if (new_x < 0) {
                    new_x = new_x + pimoroni::PicoDisplay::WIDTH;
                }
                color = color > 0 ? color - (rand & 1) : 0;
                fire[next_row + new_x] = color;
            }
        }
    }
    pico_display.update();
}

This got us over 40 FPS, which is more than the 27 FPS required by doom-fire. Yay!
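To illustrate the trade-off between the two calls, here is a self-contained sketch with a stand-in framebuffer. It is not the pico_graphics implementation: the Framebuffer type and its members are invented for illustration and only mirror the idea of a bounds-checked write next to an unchecked pointer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a display framebuffer; NOT the Pimoroni pico_graphics class.
struct Framebuffer {
    int width;
    int height;
    std::vector<uint16_t> pixels;  // zero-initialised, one entry per pixel

    Framebuffer(int w, int h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h) {}

    // Checked write: silently ignores coordinates outside the buffer.
    void set_pixel(int x, int y, uint16_t pen) {
        if (x < 0 || y < 0 || x >= width || y >= height) return;
        pixels[y * width + x] = pen;
    }

    // Unchecked access: the caller must guarantee x and y are in range.
    uint16_t* ptr(int x, int y) {
        return &pixels[y * width + x];
    }
};
```

The unchecked path saves a handful of comparisons per pixel, which adds up when every pixel is written on every frame. It is only safe because the loop bounds already guarantee valid coordinates.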

Conclusion

The Raspberry Pi Pico and the Pico Display are incredibly fun to play with.

While MicroPython is easy to get started with and to prototype in, it has a significant performance cost.

C/C++ is more complex to set up and probably has a steeper learning curve, but it can pay off if the extra performance is needed.

I'm not an expert in Python or C++ but, if you want to check out the code, head over to the GitHub repo and open issues or even pull requests.