Category Archives: programming

Anatomy of a bug fix

February 09, 2017 | Filed under: production line | programming

I thought it might be interesting to just brain dump my thoughts as I do them for a Production Line bug.

The bug is an error message which I get triggering thus:

GERROR(“No room to TrashLowPriorityResource from stockpile”);

Which is handy, as I know exactly WHAT is happening, and where in the code. Thats 95% of the work done already. Amen to asserts! I know that inside the function SIM_ResourceStockpile::TrashLowPriorityResource(), I am not able to actually do what the function needs to do. So firstly what does this function do anyway? I can’t remember… Thankfully there is a comment:

//if we find a resource not needed by current task, destroy it (only one)

Which is pretty handy. So this is a stockpile at a production task (working on a vehicle) and it needs room for a new resource presumably. The code goes through all of the objects currently in the stockpile to check we really need them. here is the code:

    iter it;
    for (it = Objects.begin(); it != Objects.end(); ++it)
    {
        SIM_ResourceObject* pobject = *it;

        //is this object needed?
        bool bneeded = false;
        SIM_ProductionTask::iter rit;
        for (int c = 0; c < SIM_ProductionSlot::MAX_RESOURCE_QUANTITIES; c++)
        {
            SIM_ResourceQuantity req = PSlot->CurrentResources[c];
            if (req.Quantity <= 0)
            {
                continue;
            }
            if (req.PResource == pobject->GetResource())
            {
                if (GetQuantity(req.PResource) > req.Quantity)
                {
                    bneeded = false;
                }
                else
                {
                    bneeded = true;
                }
            }
        }
        if (!bneeded)
        {
            RemoveObject(pobject->GetResource());
            return;
        }
    }

Thats the first half of the function, it then tries to do a simialr thing with requests, rather thanm objects. However I think I see an issue already. That GetQuantity() is checking the total quantity, not for this single object. Nope…thjat makes sense. I guess I really need to trigger the error and step through it, so time to fire up a save game from a player and run it in *slooooow* debug mode. Ok… an immediate crash from a save game..where we have 16 requests and no objects. So there seems to be a bug in the latter code:

    //ok we need to destroy a request instead of a current object.
    reqit qit;

    for (qit = Requests.begin(); qit != Requests.end(); qit++)
    {
        SIM_ResourceRequest* preq = *qit;
        //is this object needed?
        bool bneeded = false;
        for (int c = 0; c < SIM_ProductionSlot::MAX_RESOURCE_QUANTITIES; c++)
        {
            SIM_ResourceQuantity req = PSlot->CurrentResources[c];
            if (req.Quantity <= 0)
            {
                continue;
            }
            if (req.PResource == preq->PObject->GetResource())
            {
                //sure we ordered it, but do we actually have enough for this task, when com,bining requests with objects?
                int stock_and_ordered = GetStockAndOrdered(req.PResource);
                if (stock_and_ordered >= req.Quantity)
                {
                    Requests.remove(preq);
                    preq->PObject->Trash();
                    return;
                }
                bneeded = true;
            }
        }
        if (!bneeded)
        {
            Requests.remove(preq);
            preq->PObject->Trash();
            return;
        }
    }

So what could be wrong here? Ok…well hang on a darned minute. This code is just saying ‘do we really need resources of this type right now???’ rather than ‘do we need ALL of these right now?’  which is more of a problem. We are clearly somehow ‘over-ordering’. The slot in question is ‘fit trunk’. presumably it only needs trunks… They do all indeed appear to be the same type. 4 are ‘intransit’ the rest are queued up at some resource importer or a supply stockpile or component stockpile somewhere.

So the ‘solution’ is easy. First kill off (delete) a request that is not in transit yet, to make room, or if that is not possible, kill one that actually is in transit. This then frees up the required place in our stockpile. But hold on? this is just patching over the wound? how on earth did this happen? The calling code is requesting new resources, when clearly 16 are already on the way! in fact its requesting a servo…

AHAHAHAHA!

what a beautiful bug. The player obviously upgraded to a powered tailgate, which means a servo is needed for every trunk, and yet 16 trunks were already on order. Error! error! I have special code that handles when an upgrade makes an existing resource requirement obsolete (climate control replaces aircon), but not code to handle this case, where additional stuff is needed. This is FINE, I can implement the above mentioned code without worrying how it happened.

That actually wasn’t too bad at all :D

What is it with TV executives and non-tech savvy journalists? Can they not do the vaguest bit of research and kill of this myth about ‘gifted child genius programmers and hackers’? its so off-base its laughable, especially to anyone my age who works as a software engineer. If it isn’t immediately obvious what I’m talking about, its characters like this from silicon valley:

And also like this (also from silicon valley)*

And like this from ‘halt and catch fire’

And any number of media stories about ‘teenage bedroom genius hackers’. I guess it all goes back to a single film, in the early days of computers (and the threat of hacking)…war games, starring Matthew Broderick aged 19 released in 1983. The myth of the young genius computer whizz was born, and nobody has seemingly challenged it since.

Firstly…lets get something straight., The ‘cleverest’ programmers are not usually ‘hackers’. Firstly, its much easier to break something than build it. You build software with 100,000 lines of code and 1 line has a potential exploit? you did a good job 99,999 times, versus a hacker who finds that one exploit. are the finest minds in programming really working for the Russian mafia? I suspect they are more likely to be working for Apple or Deep Mind or some tech start-up with fifty million dollars worth of stock options. They get better pay and no threats of violence, which would you choose?

Secondly, computers were invented a while ago now. We have people with a LOT of experience in the field out there now. Amazingly, C++ is still perfectly usable, and very efficient, and given the choice between someone who has written tens of millions of lines of C++ over twenty or thirty years, versus some ‘bright’ kid…I’m going with the old guy/girl thanks.

Learning to code takes TIME, yet because bookshops hawk crappy ‘learn C++ in 21 days (or less)’ bullshit, some non-coders actually believe it. There is a BIG difference between ‘knowing some C++’ and being a C++ software engineer. Writing code that works is fucking easy. Writing reliable bug-free efficient, legible and flexible and safe code is fucking hard. Why do we think that surgeons with 20 years experience are the best choice for our brain operation, yet want software coded by a fourteen year old? Is there some reality-distortion field that turns programming into a Benjamin button style alternate reality?

No.

So ideally, any movie or TV series that features the ‘ace’ coder would have them aged about 30-40, maybe even older. At the very least they would be in the darned twenties. Enough with the school-age hacker god bullshit. Here is a recent picture of John Carmack. I bet he is a better coder than you, or me. He has even more grey hair than me.

While we are on the topic, the best coders are not arrogant, mouthy, uber-confident  types on skateboards wearing hip t-shirts with confrontational activist slogans on them, and flying into a rage whenever people talk to them. Nor do they always blast out heavy metal or rap music on headphones whilst coding on the floor cross-legged, and nor do they ‘do all their best work’ when on drugs, or at 3AM, or after a fifteen hour coding blitz.

These are myths that make TV characters ‘more exciting’. Except they also make them unbelievable and stupid. I’m quite unusual in being a fairly extravert (in short bursts) programmer. Put it down to being a lead guitarist in a metal band 27 years ago. Most really *good* coders I know are actually pretty quiet. They will not draw attention to themselves. they are not arrogant, they know enough to know that they know very little. Really good coders tend not to brag. I brag a bit, its PR but would I claim to be a C++ *expert*. Nope, I know what I need to know. I also only really know C++, a little bit of PHP, and some HTML, CSS, but not enough to do anything but the few things I need. When I meet coders who brag that they know 10 languages, I get that they know the syntax, but how to use them effectively? enough to write mission critical code that a company is built on? I find it hard to believe.

Most coders look pretty boring. Most of us are pretty boring. Most of us are not arrogant shouty attention seekers. The experienced ones know to stop coding by 9PM at the very latest, and to take regular breaks. We also aren’t stupid enough to store backup disks next to hi-fi speakers in the same room (an actual plot point in halt and catch fire). We make shit TV, but good code.  I suspect our portrayal will never change.

 

*All the SV cast are young, but carla seems to be portrayed as younger, cooler, more confident than the rest.

A mixed bag of stuff this week. I’m crunching away boih at bugs, and balance issues, which make for a poor video, but also luckily we now have lot-expansion plus some zoom-out UI shininess to look at, plus I fire up nvidia nsight to give you a quick preview of how a frame gets rendered in the game. (FWIW I use my own engine, based on directx9 which means I need to worry about draw calls when changing textures. One day I’ll switch to a newer directx version, but I feel no pressing urge to do so yet :D.

Enjoy:

There are no comments yet

This is the short ‘story so far’ on animation compression for Production Line, my next game. I use my own engine that runs on directx9,a nd the game is isometyric in style, so uses a lot of spritesheet-style animation frames. In short, this is about how I enable animations like the one below (excuse the crappy gif compression, actual image is less pixelated in the real game), to fit into tiny amounts of video memory. (The disk version of that anim is 4MB, compressed for download its 355k). This is an ‘idle’ anim of a marketing manager in the game checking his phone to make all-important marketing related phonecalls…

marketing

You might think that cannot possibly be a 4MB animation, but you would be dead wrong. The actual source graphic is 128 wide by 256 high, after cropping, meaning that a frame of it gets delivered from the artist looking like this:

people_business_phone_sw_0218

Thats no problem, but a complete animation cycle is 300 frames long, and I need two versions, one to look towards, one away (I can flip in X axis for the others). 128 x 256 x 300 is 9.8 million pixels, or a 4096 x 2,400 pixel sprite sheet, which takes up 39MB of VRAM. Assume both directions are in use = 80MB, 3 different characters is 240MB. Thats with one skin color and one gender. Ouch. So obviously we need to compress it.

I rolled my own compression system for fun, and to give me total control over how it works. The first step is to cut out any dead space and remove any duplicates. When our little dude is on the phone, the image doesn’t change, so we have a lot of duplication in that colossal theoretical sprite sheet. How did I approach this?

I decided upon a 64 chunk ‘grid’ for any image that would be animated, dividing it into 8 sections horizontally and vertically. With the raw image I get given, that gives me this:

grid

Actually this is annoying because his hair only just clips into another row, which is inefficient… never mind. The most obvious thing here is that a lot of the animation frame is just empty space. In order to kill both the ’empty space’ and the ‘same as last frame’ problems in one go, I work out a CRC (basically a unique signature) for each of those grid squares, for each frame, and store it in a great big list of data as a pre-processing step.

With all that data in memory, I then go through each chunk of each frame, and look for previous frame chunks with the same CRC. If I don’t find any, I mark it as a ‘used’ chunk. if I do, I just make a note of which earlier chunk to use, and increment the use count on the old chunk. Once this is done, I can go through all the chunks and discard the ones that have a zero use count (or are blank).
Stage two is to create a whacking great spritesheet of all of the chunks that I actually *do* use, which looks like a weird mashup of imagery like this:

sheet

(It also has an alpha channel). I can then go through all of those ‘used’ chunks and make a note of the UV values for each of them in my new spritesheet. Now, the final stage is to go through every frame of the animation, and make a note of which chunks actually get drawn, and the index into my list of ‘used’ chunks. This means I now have a big text file for the animation that tells me which of the 64 maximum chunks of the spritesheet I need to draw in each frame. In the case of this phone-dude, that shrinks the actual texture memory usage by 95%, meaning I can easily have a bunch of different animations.

Thats where I am right now, and its pretty fluid, and works without bugs. The next step will be optimizing the code, rather than the source. An optimisation I managed today was to actually successfully only draw the non transparent chunks when I do the rendering of each frame, which reduces the number of polys and draw calls. The next obvious optimization is to spot when a chunks ‘source id’ is the same as the previous frame, and then not bother updating it at all. Right now, I redo the UV lookup and apply the values every frame for this dudes legs, even though they don’t move during the whole animation. Thats a bit too much processing for my liking.

I’m sure middleware *does* exist that does this, but I like to code stuff myself so I know 100% how it works, 100% how fast/slow it is, and can ensure compatibility with all the rest of my code and workflow. I’ve probably spent a week doing this, maybe a bit more, but I have a pretty cool system now that lets me knock up a 30 secon text file when I get a new bunch of animation source files, then hit the launch button and the game will do all the preprocessing and give me the compressed data automatically. Yay.

This also means that with only 4MB+4MB of VRAM for an animation for a character, I could double the characters, and have both genders without running out of VRAM. This also makes the game more usable on low end PCs. Now the limit is my art budget, rather than the hardware :D Anyway here is a reminder of the final video, and also that this is for my car factory game: Production Line

marketing

Occasionally I see some interesting posts on reddit or gamasutra or some other dev site where programmers are asking about stuff to do with game development, and I find myself thinking ‘these aren’t the things you really ned to know’, but because its such a DIFFICULT thing to articulate, I never even try. I’m going to try now…and it will be messy.

 

A long time ago, when the world was in black and white and you could get your milk delivered even in the cities, I went for a job as a game programmer with a smallish UK game studio. I didn’t get the job. I remember in the interview being asked about my code, and in some vague way trying to explain that I knew what I was doing and a proper software developer. I waffled really badly about how giving sensible names to variable meant that I knew what I was doing and not some n00b.

Hahaha.

In my mind…this:

for(int counter = 0; counter < total; counter++)


Made me so much better a software engineer than

for(int x = 0; x < total; x++)


That just goes to show how much I had to learn. The real problem is that the stuff you need to learn is big ‘hand-wavey’ stuff that its really hard to explain, let alone teach! if you haven’t already read a book like Code Complete, Now would be a good time. I will try and explain the sort of stuff that it really helps to learn, and it will be vague and awkward, and hopefully that kind of makes my point about how hard it is to grok this stuff.

I have a class in my latest game called GUI_Tile. Its the visual representation of an actual tile in the game world called SIM_Tile. Each SIM_Tile has a GUI_Tile, and anything that is on-screen is handled by GUI_Tile. In order to draw my world, I go through all of the GUI_Tile objects and call draw on them. Actually, thats what I used to do, but it turns out there are a lot of them and thats fairly time consuming. When I profile the code, looking at where the time is spent, I realize that a lot of it is not actually directx calls, but its transform stuff, and scaling stuff, and working out the sizes of sprites based on some animation they are doing, and their offsets and whether or not they are even on-screen given the current camera position. The actual DrawSprite() bit of each GUI_Tile call is only the last statement…kind of a footnote.

While all this is going on, the other 7 effective cores of my CPU are chatting about whats on TV and picking fluff from their navels.

So clearly, I should find a way to multi-thread all this, even though I’m using Directx9 which is effectively single-threaded.

Soo… I could create a new thread at the start of the app, and make that my directx render thread. it could then do nothing but make render calls, and all the other stuff could happen in another thread, thus allowing me to double up on what was going on. To keep things from going crazy, and me updating a sprites co-ordinates just as directx rendered it, I need a hard STOP! (or mutex etc) at the time I start rendering, but it will still allow some overlap surely? I could carry on with the windows message handling and sound/input processing in thread 2 while thread 1 was still rendering the previous frame.

Or

I could split the GUI_Tile::Draw() call into GUI_Tile::PreDraw() and GUI_Tile::ReallyDraw(). If all of the stuff in the PreDraw() is self contained or ‘read-only’ (in other words that function never changes the value of any data outside its own object), then I can run a whole bunch of those at the same time. I can keep my directx in the main thread, but pass all the tiles in bunches off to 7 other worker threads to do PreDraw(), then when they ALL finish, I can just quickly loop through the Draw() calls, which are way faster.

Or

I could not do a lot of this transform stuff or drawing stuff each frame anyway. I could use a dirty-rects system, where when I render a frame I keep a copy of the whole thing in an offscreen buffer. I then only bother calling Draw() on those tiles which I know have actually changed or done anything. That way I’m massively cutting down on the number of Draw() calls I make, and I can keep the entire thing in one thread and thus way simpler.

Or…

etc.

The real thing you need to learn to be a decent game developer, is which one of those solutions will work best for this game, with this hardware, with this API, with this data. Coding any one of those solutions takes TIME. Debugging them takes more time, and profiling and analysing to see which works best takes more. Maybe having 8 threads all running is actually slower than just one (this can happen if you do it badly). Maybe a dirty rects system falls apart dramatically when zooming and scrolling. Maybe assuming PreDraw() is a nice self contained function is a bug-prone disaster waiting to happen in obscure and impossible-to-debug ways.

This is the real hard stuff, and the main reason its real hard is that its very difficult to get on top of this sort of thing until you have done it a LOT. If you have programmed, for example, a sound engine…well done. Take a bow, its honestly not that easy, and I bet you learned a lot. You have programmed your FIRST sound engine. Now delete all the code and program it another way entirely. then again, then again, then again. NOW, you know how to program a sound engine. you know why method A is slow, why method B is buggy, why method C is hard to maintain, why method D doesn’t scale, and method E Is awesome, but only on Playstation 4.

This is why the top jobs often to go to coders with experience, and why the sheer amount of code you have written and the number of hours you have been typing code really does matter. Being a good programmer is rarely about those things that are often bandied around like ‘always comment your code’ or ‘use descriptive source control commit descriptions’ or ‘use this syntax in your code’. Its mostly about the big ‘code architecture’ decisions. The way you lay out your code, the overall designs, the thinking behind how the whole process snaps together.

The most efficient coding you ever do is probably with a pen and paper, or a big chalkboard on the wall. When you have done this a LOT, you can do that in your head, but that takes decades. The more you code, the better you get.