Brainroll Postmortem Part 3: Engine

Nov 27, 2023

It is finally time to talk a bit about the thing I feel most comfortable with: the tech that powers Brainroll. I plan on touching on the stack, the engine, the architecture and how I solved some of the game logic and mechanics. I also want to talk about what went well, what went badly and what I will keep in mind for my next projects. As I was preparing these posts I noticed how quickly they grew in length, which made me decide to split them into even more parts. This first part focuses only on my engine. Going into every detail would be a lot to cover, so I decided to write only about the things I think are interesting or that I haven’t seen many others write about.

Engine

Brainroll is built within my own “engine” called Maraton. I put engine in quotation marks because lately I have been viewing it as more of a framework: it doesn’t encapsulate any behind-the-scenes logic or systems and is right now only controllable through code, just like any other game framework. The ambition is to turn it into a full engine some day. It is built on things I’ve learned while being a part of the Handmade Network, which I found through Casey Muratori’s Handmade Hero. Since then I’ve also been very inspired by Ryan Fleury, who has explained very interesting ways of solving certain problems we’ll talk about later.

Engine & Architecture

Maraton is built in a C style of C++ where I use a very slim subset of C++ features. The main reason is that compile times are drastically reduced by avoiding the C++ standard library and features such as templates. In addition, I have found that C style code is often easier to debug than C++ heavy code. I personally also work better with a more limited set of syntax and language features, as I am a person who tends to dive into a lot of rabbit holes when there are too many ways of solving the same problem.

As a reference the C++ features that I use are mainly:

const references - Indicates that the value passed to a function cannot be invalid and that the function does not mutate it.
Lambdas - Used in very few places to make big functions easier to read by having helper functions spatially nearby.
templated struct - I use this in one place in the engine to implement a Defer macro.
C++ style casts - Just something I experimented with but decided not to bother with. You might still find some of them in there.
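
A minimal sketch of how a Defer macro can be built from a templated struct and a lambda. This is not Maraton's actual implementation; every name here (deferred_call, DEFER_CONCAT, DeferExample) is made up for illustration.

```cpp
#include <cassert>

// Hypothetical sketch of a scope-exit helper. All names are illustrative
// and are not taken from Maraton.
template <typename callable>
struct deferred_call
{
    callable Function;
    explicit deferred_call(callable InFunction) : Function(InFunction) {}
    ~deferred_call() { Function(); } // runs when the enclosing scope ends
};

#define DEFER_CONCAT_INNER(A, B) A##B
#define DEFER_CONCAT(A, B) DEFER_CONCAT_INNER(A, B)

// Declares a lambda and a guard object holding a copy of it. __LINE__ keeps
// the generated names unique, so use at most one Defer per source line.
#define Defer(...) \
    auto DEFER_CONCAT(DeferLambda_, __LINE__) = [&]() { __VA_ARGS__; }; \
    deferred_call<decltype(DEFER_CONCAT(DeferLambda_, __LINE__))> \
        DEFER_CONCAT(DeferGuard_, __LINE__)(DEFER_CONCAT(DeferLambda_, __LINE__))

// Usage: deferred statements run in reverse order of declaration.
static int DeferExample()
{
    int Order = 0;
    {
        Defer(Order = Order * 10 + 1);
        Defer(Order = Order * 10 + 2);
        assert(Order == 0); // nothing has run yet inside the scope
    }
    return Order; // 21: the second Defer ran first, then the first one
}
```

The typical use is pairing an acquire call with its release on the next line, so the cleanup cannot be forgotten on an early return.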

I have previously introduced the structure of the engine in an older post which is now a bit outdated. Since then I have changed the structure a bit in preparation for the Brainroll release as well as my future titles.

The idea of this diagram is that it’s layered: your game is built on top of the stack, while the different submodules depend on the modules underneath them. For example, the 2D rendering system depends on the OpenGL abstraction layer as well as the platform layer, which in turn depends on the Win32 layer. This has worked great so far, but I am still in the process of changing this entire structure to be even more modular, because in the future I want the modules to be even more loosely coupled with each other. I want every single subsystem to be its own self-contained module. It may depend on another module, but you as a user of the engine should be able to use it standalone as well.

The engine does the essentials for me so that I can get to work relatively fast. Here is an overview of what the engine does “behind the scenes” during a frame, disregarding the rendering, font and UI modules.

Update timers.
Check for input (Mouse, Keyboard, Controller & Touch).
Update window size.
Prepare audio buffers.
Call into game update code.
Check if any settings were changed and update them if needed (VSYNC, Fullscreen).
Write to audio buffers with sound from the game.
Hot-reload game code if possible.
Refresh screen.
Update timers and sleep if needed to hit target FPS.

As you can see it’s not really anything fancy, but it does the important stuff that is required across game projects. I have very rarely, if ever, had to modify this. There was one project where I built a game requiring multiple windows, and that game used two of these update loops. That was probably the only time I really had to make a “big” change to this system, and all I did was basically run this update loop twice.
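
The last step of the frame, sleeping to hit the target FPS, boils down to a small calculation. The helper below is a hedged sketch with made-up names; it only computes the sleep duration and leaves the actual timer and sleep calls to the platform layer.

```cpp
#include <cassert>

// Hypothetical helper: how long the loop should sleep at the end of a frame.
// All times are in seconds. The names are illustrative, not Maraton's API.
static double
SecondsToSleep(double FrameStartTime, double CurrentTime, double TargetFPS)
{
    assert(TargetFPS > 0.0);
    double TargetFrameTime = 1.0 / TargetFPS;
    double Elapsed = CurrentTime - FrameStartTime;
    double Remaining = TargetFrameTime - Elapsed;

    // If the frame already took too long there is nothing left to wait for.
    return (Remaining > 0.0) ? Remaining : 0.0;
}
```

A frame that finishes its work early sleeps for the remainder of its time budget, while a frame that runs over budget continues immediately.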

Hot reloading

As a user of the engine you will first and foremost see the project divided into two parts: the game specific code and the platform specific code. There is a barrier between these two and a platform layer that ties them together. The platform specific code includes the entry point for the application and is compiled into an executable, while the game code is built into a .DLL that is loaded by the platform specific code at runtime. It is mainly organized this way to allow for hot-reloadable code. This means that I can recompile the game while it’s running and have it reload the DLL with the new version at runtime, which allows for faster iterations.

The way this works is quite simple: all you need to do is make the platform specific code be in charge of the memory that drives the game. You allocate the entire game’s state ahead of time inside the platform specific code and then pass it down to the DLL for it to use. Then, every frame, the platform specific code can compare timestamps on the DLL to see if there is a new version available to use.

STN_INTERNAL b32
Win32GameLoad(win32_game *Game)
{
    b32 Result = false;

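    // While the build script's lock file exists the compiler is still
    // writing the new DLL, so only load when the lock file is absent.
    WIN32_FILE_ATTRIBUTE_DATA Ignored;
    if (!GetFileAttributesEx(GlobalLockFullPath, GetFileExInfoStandard, &Ignored))
    {
        // Load a copy so the original DLL stays unlocked for the compiler.
        if (CopyFile(GameDLLPath, TempGameDLLPath, FALSE) == 0)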
        {
            Win32FatalError("Failed to load Game.", 
                            "CopyFile failed between \"%s\" & \"%s\"", 
                            GameDLLPath, TempGameDLLPath);
        }

        Game->LastDLLWriteTime = Win32GetLastWriteTime(GameDLLPath);        
        Game->DLL = LoadLibraryA(TempGameDLLPath);

        if (Game->DLL == NULL)
        {
            Win32FatalError("Failed to load Game.", 
                            "LoadLibraryA failed for \"%s\"", TempGameDLLPath);
        }

        if (!Game->DLL)
        {
            Result = false;
            return (Result);
        }
    
        // NOTE(Oskar): Store function pointers to public functions in the DLL
        // I've removed some of the functions I have to make the code shorter.
        Game->Update         = (GameUpdateCallback *)GetProcAddress(Game->DLL, "Update");

        Result = true;

        // NOTE(Oskar): If any of your functions failed to load
        if (!Game->Update)
        {
            Game->Update         = GameUpdateStub;
            Result = false;
        }
    }
    
    return (Result);
}

This snippet works by not loading the game’s DLL directly, because that would lock the file and prevent the compiler from replacing it with the freshly compiled DLL. Instead we copy game.dll into a new file called temp_game.dll and load that file into the engine. Then we can write another small snippet that compares the write time of the game.dll on disk with the one we stored when loading, and if the file on disk is newer we call this function to reload.
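
The write-time comparison can be sketched as a small decision function. To keep the sketch platform independent the timestamps are plain 64-bit numbers; on Windows they would come from the file's last write time (for example the ftLastWriteTime field filled in by GetFileAttributesEx). The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bookkeeping for the reload check.
struct game_code_state
{
    uint64_t LoadedWriteTime; // write time of game.dll when it was last loaded
};

static bool
ShouldReloadGameCode(const game_code_state &State,
                     uint64_t WriteTimeOnDisk, bool BuildLockExists)
{
    // While the lock file exists the compiler may still be writing the DLL
    // or its .pdb, so try again on a later frame.
    if (BuildLockExists)
    {
        return false;
    }
    return WriteTimeOnDisk > State.LoadedWriteTime;
}

// Usage example: a newer file triggers a reload once the lock is gone.
static void
ReloadCheckExample()
{
    game_code_state State = { 100 };
    assert(!ShouldReloadGameCode(State, 100, false)); // unchanged on disk
    assert(!ShouldReloadGameCode(State, 250, true));  // build in progress
    assert(ShouldReloadGameCode(State, 250, false));  // new build is ready
}
```

When the function returns true, the platform layer would unload the old copy, run the load function again and store the new write time.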

There is a slight problem that you may encounter: MSVC won’t allow you to overwrite the .pdb files when applying this approach. There is a small hack that I learned from Handmade Hero that gets around this issue in the build script:

Delete all previously generated .pdb files.
Create a temporary file to indicate that we’re in the process of building (GlobalLockFullPath in the snippet above).
Build again, but include something random or unique in the name of the .pdb file.
Delete the temporary file you created before.

I use batch scripts when building my engine on Windows and this is how it can look when writing this in batch:

if not exist build mkdir build
pushd build

del *.pdb > NUL 2> NUL
echo WAITING FOR PDB > lock.tmp

start /b /wait "" "cl.exe"  %build_options% %compile_flags% ../src/game/maraton.cpp -Fmgame.map /LD /link -PDB:%dll_name%_%random%.pdb %common_link_flags% /out:%dll_name%.dll

start /b /wait "" "cl.exe"  %build_options% %compile_flags% ../src/engine/win32/win32_maraton.cpp maraton.res -Fmwin32_maraton.map /link %platform_link_flags% /out:%application_name%.exe

del lock.tmp

popd

Batch uses %identifier% for variables, and %random% is a built-in variable that expands to a random number between 0 and 32767. You can see in this snippet how we pass the following flag to the MSVC linker: -PDB:%dll_name%_%random%.pdb.

This system may be a bit cumbersome at first as a new user because you don’t have the game’s memory available at your fingertips. You have to ask the platform for more memory if you need it. If you don’t, and just allocate your memory from the DLL, it will be invalid upon a reload. I am a bit lucky because Brainroll is a game where I could always easily know how much memory the game needs, so it has a fixed pool allocated on startup and that pool is re-used for everything during the entire runtime of the game. There is however an exception to this that I will get into later. It is worth noting that this only applies if you are using hot reloading; if you don’t use it, you are free to allocate memory wherever you want.
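
A fixed pool like this is often implemented as a simple bump allocator ("arena"). The sketch below assumes that approach; memory_arena, ArenaPush and ArenaReset are hypothetical names and not necessarily how Maraton does it. The important property is that the backing memory is owned by the platform layer, so it survives a DLL reload.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-size arena. The platform layer allocates Base once and
// hands the arena to the game code every frame.
struct memory_arena
{
    uint8_t *Base;
    size_t   Size;
    size_t   Used;
};

static void *
ArenaPush(memory_arena *Arena, size_t Bytes, size_t Alignment = 16)
{
    // Alignment must be a power of two for the mask trick below.
    assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0);

    uintptr_t Current = (uintptr_t)Arena->Base + Arena->Used;
    uintptr_t Aligned = (Current + (Alignment - 1)) & ~(uintptr_t)(Alignment - 1);
    size_t Offset = (size_t)(Aligned - (uintptr_t)Arena->Base);

    if (Offset + Bytes > Arena->Size)
    {
        return nullptr; // pool exhausted: the game must ask the platform for more
    }

    Arena->Used = Offset + Bytes;
    return Arena->Base + Offset;
}

// Re-use the whole pool, for example when loading a new level.
static void
ArenaReset(memory_arena *Arena)
{
    Arena->Used = 0;
}
```

Because the game only ever receives pointers into this pool, reloading the DLL swaps the code but leaves the data in place.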

Over time I have come to dislike this barrier, and hence I am going to remove the hot-reloading functionality from my engine as part of the work to make it more modular, at least until I figure out how to solve this problem in a good way. Overall I did get very good use out of it while developing Brainroll, especially while making levels: the earlier versions of Brainroll required a recompile to alter the levels, so this was a very good way of making changes while playing the game at the same time.

UI & Font

I have covered the UI subsystem in Maraton before, and a better explanation and deep dive into the kind of system I implemented is available on Ryan Fleury’s website. That is why I will put more focus on how I solved the font related stuff, which I have not talked about before. I decided not to use a library and instead implement my own functionality for baking a font atlas. In the past I relied upon the stb_truetype.h library, which works really well, but I experienced some issues when scaling my text to different sizes, more specifically very small text, which made me want to roll my own system.

I went with DirectWrite as a backend. I made this decision because most of the other stuff I use in my engine is Windows specific (WASAPI, XInput, etc.) and I feel a bit familiar with how they structure their APIs. DirectWrite is only used to rasterize the font into a texture; during the game’s runtime the font is rendered like any other texture through my own rendering pipeline. The reason that we even have to use a library to do this for us is that font rendering is extremely complicated to get right. Ignoring the fact that rendering glyphs for other languages such as Arabic is completely different from English, we also have things such as anti-aliasing to worry about, and I really didn’t feel like diving into that.

DirectWrite comes with its own subpixel rendering technique called ClearType, which I was pretty excited to make use of in my engine. ClearType can make text look a bit smoother by taking advantage of how the individual red, green and blue subpixels of an LCD screen are laid out.

a) Text rendered without ClearType. b) Text rendered with ClearType. Image from Wikipedia licensed under CC BY-SA 3.0

The way I want to represent a Font within the engine is very simple. What I need to know is some information about each glyph, such as its dimensions, and then I want to write each glyph into one big texture so that I can batch draw text through instanced rendering later on. To do this I also need to store the UV coordinates for each glyph to know where in the texture atlas the glyph is located. We also add a small offset because there needs to be some space between each glyph so that they don’t bleed into each other. The Font also has some important values to keep track of, such as Ascent, Descent & LineGap, which allow you to calculate the height of text if you are going to write several lines of text on the screen.

struct glyph_metrics
{
    f32 OffsetX;
    f32 OffsetY;
    f32 Advance;
    f32 XYW;
    f32 XYH;
    f32 UVX;
    f32 UVY;
};

struct font
{
    glyph_metrics *Metrics;
    int32_t GlyphCount;

    f32 Ascent;
    f32 Descent;
    f32 LineGap;
    f32 PointSize;

    opengl_texture Texture;

    void *Face;
};
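
As a rough illustration of how Ascent, Descent and LineGap combine into a text height, here is a small sketch. It assumes all three values are already in pixels and that Descent is stored as a positive distance below the baseline; line_metrics and both functions are hypothetical names.

```cpp
#include <cassert>

// Hypothetical subset of the font struct, with values in pixels.
struct line_metrics
{
    float Ascent;
    float Descent; // assumed positive: distance below the baseline
    float LineGap;
};

// Distance from one baseline to the next.
static float
LineAdvance(const line_metrics &Metrics)
{
    return Metrics.Ascent + Metrics.Descent + Metrics.LineGap;
}

// Total height of a block of text: every line needs Ascent + Descent, and
// the gap only appears between lines.
static float
TextBlockHeight(const line_metrics &Metrics, int LineCount)
{
    assert(LineCount >= 0);
    if (LineCount == 0)
    {
        return 0.0f;
    }
    return (float)LineCount * (Metrics.Ascent + Metrics.Descent) +
           (float)(LineCount - 1) * Metrics.LineGap;
}
```

With an ascent of 12, a descent of 4 and a line gap of 2, three lines of text come out 52 pixels tall.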

The steps of actually rendering a full font into an atlas are not completely trivial. There is a lot of setup before we can start using DirectWrite to actually render our glyphs for us. First we need a factory that is used to create all our DirectWrite specific structures. Using that factory we create a reference to the font file we want to read, and using that we can then create a font face, which is the in-memory representation of a renderable font. This font face is something that we will keep around during the lifetime of the font in order to look up the indices into our glyph metrics for specific glyphs. This can be removed if we implement some sort of cache, but I have not got that far yet.

HRESULT Error = 0;

IDWriteFactory *Factory = 0;
Error = DWriteCreateFactory(DWRITE_FACTORY_TYPE_SHARED, __uuidof(IDWriteFactory), (IUnknown**)&Factory);

IDWriteFontFile *FontFile = 0;
Error = Factory->CreateFontFileReference(FontPath, 0, &FontFile);

IDWriteFontFace *Face = 0;
Error = Factory->CreateFontFace(DWRITE_FONT_FACE_TYPE_TRUETYPE, 1, &FontFile, 0, DWRITE_FONT_SIMULATIONS_NONE, &Face);

Unfortunately DirectWrite objects don’t clean themselves up after use, so we need to call the Release method on each of them. This is one of the places where I found the Defer macro that I referred to earlier to be of good use. You may also have noticed that I’ve included an Error variable in the snippet; this is also something that we need to check. I did not include error handling or cleanup in the code here just to save space.

Moving on, we need to specify some parameters to tell DirectWrite how it should render our glyphs. I decided to use the defaults that are set in the Windows control panel. Then we need a method of rendering glyphs into a bitmap. I decided to use GDI; I didn’t explore the alternatives here, so the decision is arbitrary. GDI is used by creating a GDIInterop object, which acts almost like the factory that we created earlier. We will only use it in order to create a bitmap render target.

FLOAT Gamma = 1.0f;
IDWriteRenderingParams *DefaultRenderingParams = 0;
Error = Factory->CreateRenderingParams(&DefaultRenderingParams);

IDWriteRenderingParams *RenderingParams = 0;
Error = Factory->CreateCustomRenderingParams(Gamma,
                                                DefaultRenderingParams->GetEnhancedContrast(),
                                                DefaultRenderingParams->GetClearTypeLevel(),
                                                DefaultRenderingParams->GetPixelGeometry(),
                                                DWRITE_RENDERING_MODE_NATURAL,
                                                &RenderingParams);

IDWriteGdiInterop *DWriteGDIInterop = 0;
Error = Factory->GetGdiInterop(&DWriteGDIInterop);

We can now retrieve some metrics regarding the font to know how big a bitmap we are going to have to allocate. We will draw each glyph to the bitmap and then copy it over to our bigger texture atlas.

Baked font atlas. Notice how much space is wasted (topic of future post).

DirectWrite uses design units to stay portable between differently scaled devices. This means that we have to convert design units into pixels when rasterizing the font. I’ve put together a small glossary of the terms that are used. Most of this is from here.

Design Unit - Abstract unit independent of screen or text size and varies in resolution between fonts.
Em - Unit that scales relative to the visual size of the text.
Point - Fixed unit of physical length. This is 1/72 of an inch.
Design Unit / Em - The scale of a font's design unit. This exists within the Font Metrics.
Point / Em - The point size of text. For example 12pt text is 12 Point / Em.
Inch / Point - Always 1 / 72
Pixel / Inch - Also known as DPI. Default is 96 if your application is not DPI aware.

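// Assumed here: the font metrics are fetched from the face created earlier.
DWRITE_FONT_METRICS FontMetrics = {};
Face->GetMetrics(&FontMetrics);

f32 PixelPerEM = PointSize * (1.0f / 72.0f) * DPI;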
f32 PixelPerDesignUnit = PixelPerEM/((f32)FontMetrics.designUnitsPerEm);

i32 RasterTargetWidth  = (i32)(Padding * ((f32)FontMetrics.capHeight)*PixelPerDesignUnit);
i32 RasterTargetHeight = (i32)(Padding * ((f32)FontMetrics.capHeight)*PixelPerDesignUnit);
f32 RasterTargetX      = (f32)(RasterTargetWidth  / 2);
f32 RasterTargetY      = (f32)(RasterTargetHeight / 2);

IDWriteBitmapRenderTarget *RenderTarget = 0;
Error = DWriteGDIInterop->CreateBitmapRenderTarget(0, RasterTargetWidth, RasterTargetHeight, &RenderTarget);

// GDI based HDC that allows us to make GDI calls to render.
HDC DC = RenderTarget->GetMemoryDC();
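
To make the unit conversion concrete: 12pt text at 96 DPI works out to 12 / 72 * 96 = 16 pixels per em. The helpers below are a hedged sketch with hypothetical names, and the value of 2048 design units per em used in the usage note is only an example; real fonts report their own value in the font metrics.

```cpp
#include <cassert>

// Hypothetical helpers mirroring the conversion described above.
static float
PixelsPerEm(float PointSize, float DPI)
{
    // Point / Em  *  Inch / Point  *  Pixel / Inch  =  Pixel / Em
    return PointSize * (1.0f / 72.0f) * DPI;
}

static float
DesignUnitsToPixels(float DesignUnits, float PointSize, float DPI,
                    float DesignUnitsPerEm)
{
    assert(DesignUnitsPerEm > 0.0f);
    float PixelsPerDesignUnit = PixelsPerEm(PointSize, DPI) / DesignUnitsPerEm;
    return DesignUnits * PixelsPerDesignUnit;
}
```

For a font with 2048 design units per em, a measurement of 1024 design units at 12pt and 96 DPI is half an em, which is 8 pixels.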

Since we will write several glyphs to the same bitmap we need to clear it on every iteration. For this we have a small helper function that clears the render target that we’ve created.

void
_DirectWriteClearDC(HDC DC, COLORREF Color, uint32_t L, uint32_t T, uint32_t R, uint32_t B)
{
    HGDIOBJ Original = SelectObject(DC, GetStockObject(DC_PEN));
    SetDCPenColor(DC, Color);
    SelectObject(DC, GetStockObject(DC_BRUSH));
    SetDCBrushColor(DC, Color);
    Rectangle(DC, L, T, R, B);
    SelectObject(DC, Original);
}

Next, all we need to do is go over each glyph, render it to our bitmap and then copy it over to our texture atlas while also filling in our glyph_metrics.

Font.GlyphCount = Face->GetGlyphCount();
u32 Column = 0;
u32 Row    = 0;
for (u16 GlyphIndex = 0; GlyphIndex < Font.GlyphCount; ++GlyphIndex)
{
    // NOTE(Oskar): Render glyph into RenderTarget
    DWRITE_GLYPH_RUN GlyphRun = {};
    GlyphRun.fontFace = Face;
    GlyphRun.fontEmSize = PixelPerEM;
    GlyphRun.glyphCount = 1;
    GlyphRun.glyphIndices = &GlyphIndex;
    RECT BoundingBox = {0};
    Error = RenderTarget->DrawGlyphRun(RasterTargetX, RasterTargetY,
                                        DWRITE_MEASURING_MODE_NATURAL, 
                                        &GlyphRun, RenderingParams,
                                        RGB(255, 255, 255), &BoundingBox);

    DWRITE_GLYPH_METRICS GlyphMetrics = {};
    Error = Face->GetDesignGlyphMetrics(&GlyphIndex, 1, &GlyphMetrics, false);

    i32 TextureWidth  = BoundingBox.right - BoundingBox.left;
    i32 TextureHeight = BoundingBox.bottom - BoundingBox.top;

    // Fill in Font.Metrics[GlyphIndex] for this glyph.

    HBITMAP Bitmap = (HBITMAP)GetCurrentObject(DC, OBJ_BITMAP);
    DIBSECTION DIB = {};
    GetObject(Bitmap, sizeof(DIB), &DIB);

    // Copy bytes over to Atlas here.

    Column++;
    if ((Column * GlyphSize) >= AtlasWidth)
    {
        Column = 0;
        Row++;
    }

    _DirectWriteClearDC(DC, Win32DirectWriteBackColor, 0, 0, RasterTargetWidth, RasterTargetHeight);
}

The glyph metrics are really simple to fill in: you just specify the dimensions of the glyph and its UV coordinates in the atlas. Copying the pixels from the bitmap into your atlas is as simple as looping over every x and y coordinate in the bitmap. To access the pixels from the DIBSECTION you can use DIB.dsBm.bmBits. I have not included the exact code used in Brainroll because I am going to write a separate post about how my texture atlas is allocated.
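
This is not the Brainroll code, but the copy and UV step can be sketched under some simplifying assumptions: a fixed-grid atlas where every glyph gets one square cell, 32-bit pixels, and source rows stored top to bottom. The atlas and atlas_slot structs and AtlasCopyGlyph are hypothetical, and a real DIB may need its row pitch and row order handled differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fixed-grid atlas of 32-bit pixels.
struct atlas
{
    uint32_t *Pixels;
    int Width;
    int Height;
    int CellSize; // every glyph gets a CellSize x CellSize slot
};

struct atlas_slot
{
    float U0, V0, U1, V1; // normalized corners of the glyph in the atlas
};

static atlas_slot
AtlasCopyGlyph(atlas *Atlas, int Column, int Row,
               const uint32_t *Source, int SourcePitchInPixels,
               int SourceX, int SourceY, int GlyphWidth, int GlyphHeight)
{
    assert(GlyphWidth <= Atlas->CellSize && GlyphHeight <= Atlas->CellSize);
    int DestX = Column * Atlas->CellSize;
    int DestY = Row * Atlas->CellSize;
    assert(DestX + GlyphWidth <= Atlas->Width);
    assert(DestY + GlyphHeight <= Atlas->Height);

    // Copy the glyph's bounding box row by row into its cell.
    for (int Y = 0; Y < GlyphHeight; ++Y)
    {
        const uint32_t *SourceRow = Source + (SourceY + Y) * SourcePitchInPixels + SourceX;
        uint32_t *DestRow = Atlas->Pixels + (DestY + Y) * Atlas->Width + DestX;
        for (int X = 0; X < GlyphWidth; ++X)
        {
            DestRow[X] = SourceRow[X];
        }
    }

    // UVs are pixel positions divided by the atlas dimensions.
    atlas_slot Slot;
    Slot.U0 = (float)DestX / (float)Atlas->Width;
    Slot.V0 = (float)DestY / (float)Atlas->Height;
    Slot.U1 = (float)(DestX + GlyphWidth) / (float)Atlas->Width;
    Slot.V1 = (float)(DestY + GlyphHeight) / (float)Atlas->Height;
    return Slot;
}
```

SourceX and SourceY would be the top-left corner of the bounding box reported when the glyph was drawn, so only the pixels that belong to the glyph are copied.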

This implementation has really helped me make text look crisp, especially when scaling the font way down. In the future I plan on implementing more backends here to let the user of the engine choose between DirectWrite and FreeType, just like you might want to use D3D over OpenGL.

Audio, Renderer, Memory, Crash reporter, misc

The remaining systems of the engine are in my opinion not that interesting to go over. My audio implementation is very basic and doesn’t do anything fancy: all it does is mix audio samples in a buffer and send it to WASAPI. I haven’t even bothered putting it in a separate thread yet. The same is basically true for the rendering system. I might be biased, as almost everyone I know uses some sort of push buffer for their rendering system, which is something I first learned from Handmade Hero.

The idea is to have an immediate API but defer the actual rendering to the end of the frame. This is done by pushing the task to render something into a buffer. At the end of a frame you can simply loop over all your render tasks, and the ones that perform the same task can be instanced. For example, if I render the same texture many times after each other I can render them all in just one draw call, which is very efficient. The same works for the renderer’s internal state: I can push a task for it to set which OpenGL framebuffer to activate or which blending modes to toggle on or off, and then it will do everything it has to do at the end of the frame.
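
A stripped-down sketch of the push buffer idea, with no actual graphics calls. The command types, struct layout and function names are all hypothetical; the flush function only counts how many draw calls a real backend would issue, so the batching of consecutive sprites that share a texture is easy to see.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical command types and layout, for illustration only.
enum render_command_type
{
    RenderCommand_SetBlend,
    RenderCommand_DrawSprite,
};

struct render_command
{
    render_command_type Type;
    uint32_t TextureID;  // DrawSprite: which texture to draw
    float X, Y;          // DrawSprite: where to draw it
    bool BlendEnabled;   // SetBlend: desired blend state
};

static const int MaxRenderCommands = 1024;

struct push_buffer
{
    render_command Commands[MaxRenderCommands];
    int Count;
};

static void
PushSprite(push_buffer *Buffer, uint32_t TextureID, float X, float Y)
{
    assert(Buffer->Count < MaxRenderCommands);
    render_command Command = {};
    Command.Type = RenderCommand_DrawSprite;
    Command.TextureID = TextureID;
    Command.X = X;
    Command.Y = Y;
    Buffer->Commands[Buffer->Count++] = Command;
}

static void
PushBlend(push_buffer *Buffer, bool Enabled)
{
    assert(Buffer->Count < MaxRenderCommands);
    render_command Command = {};
    Command.Type = RenderCommand_SetBlend;
    Command.BlendEnabled = Enabled;
    Buffer->Commands[Buffer->Count++] = Command;
}

// Called once at the end of the frame. Returns how many draw calls a real
// backend would have issued, then empties the buffer for the next frame.
static int
FlushPushBuffer(push_buffer *Buffer)
{
    int DrawCalls = 0;
    int Index = 0;
    while (Index < Buffer->Count)
    {
        const render_command *Command = &Buffer->Commands[Index];
        if (Command->Type == RenderCommand_DrawSprite)
        {
            // Gather the run of consecutive sprites that share a texture.
            int RunEnd = Index + 1;
            while (RunEnd < Buffer->Count &&
                   Buffer->Commands[RunEnd].Type == RenderCommand_DrawSprite &&
                   Buffer->Commands[RunEnd].TextureID == Command->TextureID)
            {
                ++RunEnd;
            }
            // A real backend would upload (RunEnd - Index) instances here
            // and issue a single instanced draw.
            ++DrawCalls;
            Index = RunEnd;
        }
        else
        {
            // State change: a real backend would toggle blending here.
            ++Index;
        }
    }
    Buffer->Count = 0;
    return DrawCalls;
}
```

Only consecutive commands are merged in this sketch, so a state change between two sprites with the same texture splits them into separate draw calls and the submission order is preserved.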

Since I created mine, open source implementations of the same approach have been released. One is SimplyRend made by Wassimulator, who is the developer of AV-Racer. This library shares many of the ideas I have in my system.

I am a paranoid person, and I think every developer about to push the release button for their project is. What happens when someone starts the game on their machine? Does it even start? I ask myself things like that all the time. Luckily one of the developers of Happenlance, Phillip Trudeau, wrote a great post about how he uses Discord webhooks as his crash reporter. I implemented this system and it has worked like a charm. The simple idea is that if the game crashes we save a crash dump and send it to a private Discord channel where I can download and debug it.

Thoughts

Moving back to the postmortem perspective of the post, I believe that using my own engine has its upsides and downsides. First of all, the positive is that I am in control of my own tech. This has always been a goal of mine and it really improves my experience as a developer. In order to have this luxury I did sacrifice development time; I am confident that games can be developed faster using a pre-built engine. I believe that if my goal was to get my company started and just focus on pushing products, I would be much better off using pre-existing tech and having my engine as a side project. It would allow me to focus on building a great game instead of spending a lot of time programming things that might not matter, but at the same time I don’t want to do that as it would remove the fun for me.

Another downside is that there is no one to ask, and the documentation only exists to the extent that I have written it myself. If something is not working I have to solve it myself. While cumbersome, this is also a very valuable learning experience, and I think that I have become much more hardened as a developer because of it. Ever since I started this project my life as a professional developer has improved a lot; I have managed to get a well paying role that I wouldn’t have gotten without the knowledge I’ve gathered from this project. In the future I will continue to work on my own engine, but I will be open to taking it in a direction where it does more work for you so that I can be more effective when working on a new product. I think this will be even more valuable when bringing other people into the project.

For this to work I think I need to be more strict with how I allocate my own time. I need to separate engine and game so that I don’t get stuck working too much on my engine and also make progress on the game projects. I consider myself a programmer first; it is what I am best at and most comfortable with, hence I have a tendency to stick my head in the sand and tunnel vision on programming problems. Working on Brainroll has helped me a lot with this problem, however. I feel like building a game from start to finish really gave me some important insights into what it really takes to get to the finish line.

Did you know that you get access to the Maraton source code if you become a paid subscriber to my website? If you’re interested in learning or in being a part of the development, don’t hesitate to click the button below!


Nullsson
