
Drawing a LOT of sprites

August 29, 2016 | Filed under: programming

I’m doing early work on my next game, a completely new IP, which I’ll be announcing in a few months. Anyway… it involves drawing a big world with a LOT of objects in it. Tens of thousands on screen, probably. Drawing 10,000 objects in 2D is not as simple as you might think.

If you are a non-coder, or someone who only ever uses middleware, you might think ‘the new video cards can draw 10,000,000 polys per frame, what’s the problem?’ And indeed there is *no problem* if you want to set a single texture and then splat 5 million sprites on the screen that show it. That’s one (well… probably several) draw call, one render state, one texture. Even really old video cards can handle that.

The problem is when you have a lot of different textures and want to swap between them, because for engine-related reasons, you need to draw stuff in a specific order. In a 3D world, you can use a Z-buffer and draw out of order, but with 2D objects with soft aliased edges, that looks BAD. The good old fashioned painters-algorithm is your friend. The problem is, if you draw back to front and the sprite textures needed go A B A B A B, you are kinda fucked… that means a lot of texture changes, and in DirectX 9 (which I use for compatibility reasons), texture changes mean separate draw calls, which stall the video card, and are sloooowwwww.

Relevant video from GSB2:

So what are the workarounds?

Texture atlases. This is the obvious one. Stick A & B in the same texture and you are laughing; suddenly stuff is a LOT quicker. This only solves the texture issue, not drawing to different render targets, but you can defer those draws anyway and do them separately (GSB 2 does this). Texture atlases are an obvious ‘win’ even if they only halve the texture changes. The problems here are that you either need to know which textures will follow each other and pre-compile texture atlases (something I’m trying right now), or you need to dynamically create texture atlases based on recent drawing, effectively using an off-screen render target as a texture ‘cache’. I tried this recently… and it was actually slower :(
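
As a rough illustration (a hypothetical sketch, not code from my engine), the core of the atlas approach is just storing each sprite’s sub-rectangle and remapping its UVs before filling the vertex buffer:

======================================================================================
// Hypothetical sketch of atlas UV remapping; none of these names are real engine code.
struct AtlasEntry
{
    float U0, V0; // top-left of this sprite's sub-rectangle within the atlas
    float U1, V1; // bottom-right of the sub-rectangle
};

// Convert a sprite-local UV (0..1) into atlas space before filling the vertex buffer.
inline void RemapToAtlas(const AtlasEntry& entry, float& u, float& v)
{
    u = entry.U0 + u * (entry.U1 - entry.U0);
    v = entry.V0 + v * (entry.V1 - entry.V0);
}
======================================================================================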

Dirty-rects. Basically, draw the whole scene once and save it in an offscreen buffer, use that as your background (a single quad blaps the whole screen), and only draw stuff that has changed / is animating. This, as I recall, was used by SimCity 4. The only problem is that scrolling really causes hell.
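
A minimal sketch of the idea, assuming some hypothetical helpers for the render-target plumbing (note how any scroll invalidates the whole cache, which is exactly why scrolling hurts):

======================================================================================
#include <windows.h> // RECT
#include <vector>

// All hypothetical: stand-ins for the engine's render-target plumbing.
struct RenderTarget;
extern RenderTarget* PSceneCache;
extern bool BSceneCacheInvalid;
extern std::vector<RECT> DirtyRects;
void RenderFullSceneTo(RenderTarget* target);
void DrawFullScreenQuad(RenderTarget* source);
void RedrawRegion(const RECT& area);

void RenderFrame()
{
    if (BSceneCacheInvalid) // e.g. after a scroll, everything must be rebuilt
    {
        RenderFullSceneTo(PSceneCache);
        BSceneCacheInvalid = false;
    }
    DrawFullScreenQuad(PSceneCache); // the entire static scene in one call
    for (const RECT& r : DirtyRects) // then just the changed / animating bits
        RedrawRegion(r);
    DirtyRects.clear();
}
======================================================================================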

Intelligent grouping. The painters algorithm is only really needed where stuff overlaps. If I draw a tile, then draw a sprite on top of it, I need the tile first, but there is no reason why I can’t draw all the tiles first, then the contents. That means I can sort the tiles by texture and draw them in a handful of calls (or one, if the tiles all fit into an atlas). You can do this at pretty much any level, effectively drawing ‘out-of-order’ but with caveats. Again, GSB2 does this with various objects, especially debris and asteroids. In fact it goes one stage further, scanning ahead with each draw call to see if some non-conflicting later objects could be ‘collapsed’ into the current draw call.
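
For example, if a whole layer is known not to overlap internally, draw order within that layer is free, so you can sort by texture and batch. A hedged sketch (the batching helpers are made up):

======================================================================================
#include <algorithm>
#include <vector>

struct Sprite { int TextureID; float X, Y; };

// Assumed batching helpers, not real engine calls.
void SetTexture(int textureID);
void QueueQuad(float x, float y);
void FlushBatch();

// Within a non-overlapping layer (e.g. ground tiles), order is free,
// so sort by texture to get one draw call per texture group.
void DrawLayer(std::vector<Sprite>& layer)
{
    std::sort(layer.begin(), layer.end(),
        [](const Sprite& a, const Sprite& b) { return a.TextureID < b.TextureID; });

    int current = -1;
    for (const Sprite& s : layer)
    {
        if (s.TextureID != current)
        {
            FlushBatch();            // submit everything using the previous texture
            SetTexture(s.TextureID); // one state change per texture group
            current = s.TextureID;
        }
        QueueQuad(s.X, s.Y);         // appended to the current batch
    }
    FlushBatch();
}
======================================================================================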

Multi-threading and other speed boosts. If you have too many draw calls and things are too slow, you can expand the time available to make draw calls. Essentially you have two threads: one which prepares everything to be drawn, and the draw-call thread, which makes all your DirectX calls. This way they both run in parallel. (Also note that the DirectX runtime will be another thread, and the video card driver another one, so you have 2 threads fewer than you think. With a hyperthreaded 4-core chip, you have 8 hardware threads, so you give away 2, have 1 main thread, 1 render thread and 4 extra ‘worker threads’ spare.) Because of my own disorganisation, I tend to have DirectX called from my main thread, which means I do the inverse. GSB2 did this, with all of the transformation stuff for the asteroids, debris and other bits and pieces handed to a bunch of threads while the main thread was busy with other stuff, then returning to the main thread to present the draw calls. Less efficient, but way better than single threading.
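
Here is a toy version of that split (modern C++ threading for brevity; the real code predates std::thread, and the helper functions are assumptions). The update thread builds a command list for frame N+1 while the render thread submits frame N:

======================================================================================
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

struct DrawCommand { int TextureID; float X, Y; };

// Assumed helpers: BuildFrame fills a command list from game state;
// SubmitDrawCalls makes the actual DirectX calls (one thread only).
void BuildFrame(std::vector<DrawCommand>& out);
void SubmitDrawCalls(const std::vector<DrawCommand>& frame);

std::vector<DrawCommand> Pending;    // written by the update thread
std::vector<DrawCommand> Submitting; // handed over to the render thread
std::mutex HandoverMutex;
std::condition_variable FrameReady, FrameConsumed;
bool BFrameReady = false;

void UpdateThreadProc()
{
    for (;;)
    {
        BuildFrame(Pending); // prepare the next frame while the last one renders
        std::unique_lock<std::mutex> lock(HandoverMutex);
        FrameConsumed.wait(lock, [] { return !BFrameReady; });
        Pending.swap(Submitting);
        BFrameReady = true;
        lock.unlock();
        FrameReady.notify_one();
    }
}

void RenderThreadProc()
{
    for (;;)
    {
        std::vector<DrawCommand> frame;
        {
            std::unique_lock<std::mutex> lock(HandoverMutex);
            FrameReady.wait(lock, [] { return BFrameReady; });
            frame.swap(Submitting); // take ownership so the lock is held briefly
            BFrameReady = false;
        }
        FrameConsumed.notify_one();
        SubmitDrawCalls(frame); // all the DirectX work happens on this thread
    }
}
======================================================================================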

Hybrids. All of the above techniques seem valid to me. Although I am currently fixated on pre-compiled texture atlases, I’ll definitely use multithreading and probably some of the others. With some parts of a game, a specific optimisation system may work well, and with others it could be useless. It really is specific to what you draw, which is why I prefer a hand-crafted engine to a generic one.

My basic problem is that (without explaining what the game is), I have a small number of ‘components’ that make up a tiny scene on a tile. There will be a lot of components per tile, as I want a fairly ‘busy’ look, but rendering them all individually may be ‘too much’. What I may end up doing is pre-rendering each conceivable tile as an offline step, to reduce the number of calls. I’d like that to be a last-minute thing though, so I can keep editing what the scenes look like. I also want sections of each tile to animate, or be editable and customizable, which means there is less scope to pre-render them.
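
If I do go the pre-rendering route, the core of it would look something like this hypothetical bake-and-cache step (every name here is invented for the example); anything animated or customized would still have to be drawn on top the normal way:

======================================================================================
#include <map>
#include <vector>

struct Texture;
struct Component { int SpriteID; float OffsetX, OffsetY; };

// Assumed render-to-texture helpers, not real engine functions.
Texture* CreateRenderTargetTexture();
void SetRenderTarget(Texture* target);
void RestoreBackBuffer();
void DrawSprite(int spriteID, float x, float y);

std::map<int, Texture*> TileCache; // tile layout ID -> baked texture

// Compose a tile's many components into one texture the first time it is
// needed; from then on the whole tile costs a single draw call.
Texture* GetBakedTile(int layoutID, const std::vector<Component>& components)
{
    std::map<int, Texture*>::iterator it = TileCache.find(layoutID);
    if (it != TileCache.end())
        return it->second;

    Texture* baked = CreateRenderTargetTexture();
    SetRenderTarget(baked);
    for (const Component& c : components)
        DrawSprite(c.SpriteID, c.OffsetX, c.OffsetY);
    RestoreBackBuffer();

    TileCache[layoutID] = baked;
    return baked;
}
======================================================================================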

It will make a lot more sense once I announce the game :D

When I first started coding, especially when I first did C++, there was a lot of confusion about where exactly the ‘game’ was, and more specifically, what a main game loop looks like. Because these days all you hip kids code in Unity or Gamemaker or MyFirstIDE or some other colorful beginners’ IDE for the under 5s*, you never actually see your game loop, and have no idea where it is, let alone what it does. However, us real men who code from the ground up have to write one. Here is mine from the PC version of Democracy 3. There are few comments, because in my godlike knowledge, I understand it all at a glance.

======================================================================================
void APP_Game::GameProc()
{
    // Handle alt-tab / fullscreen device loss before doing anything else
    HRESULT result = GetD3DEngine()->GetDevice()->TestCooperativeLevel();
    if(FAILED(result))
    {
        if (result == D3DERR_DEVICELOST )
        {
            // Device lost and not yet restorable; try again next time round
            Sleep( 50 );
            return;
        }
        else if(result == D3DERR_DEVICENOTRESET)
        {
            if (!GetD3DEngine()->Restore())
            {
                // Device is lost still
                Sleep( 50 );
                return;
            }
            Restore3D(); // device is back, recreate lost resources
        }
        else
        {
            Sleep( 50 );
            return;
        }
    }

    GTimer looptimer;
    looptimer.StartTimer();
#ifndef _GOG_
    GetSteam()->Process(); // pump Steam callbacks (absent in the DRM-free GOG build)
#endif //_GOG_
    GetInput()->Process();
    GUI_GetCursor()->SetCursorStyle(GUI_Cursor::DEFAULT); //reset each frame...
    if(BActive)
    {

        GUI_GetMusic()->Process();
        GUI_GetSounds()->Process();
        if(PCurrentMode)
        {
            PCurrentMode->ProcessInput();
        }
        SIM_GetThreadManager()->ProcessFrame(); // kick off this frame's worker-thread sim tasks
        GetTextureHistory()->Reset(); // per-frame texture stats (debug overlay below)
        LOCKRENDERTHREAD;

        GetD3DEngine()->BeginRender();
        if(PCurrentMode)
        {
            PCurrentMode->Draw();
        }
    
        GUI_GetTransition()->Draw();


        GetD3DEngine()->EndRender();

        RenderFont();
#ifdef _DEBUG
        if(GetInput()->KeyDown(VK_LSHIFT))
        {
            GetTextureHistory()->Draw();
        }
#endif //_DEBUG
        GetD3DEngine()->Flip();

        RELEASERENDERTHREAD;

        // Crude frame limiter: yield if the frame finished inside 16ms (~60fps)
        looptimer.Update();
        if(looptimer.GetElapsed() < 16.0f)
        {
            looptimer.Update();
            if(looptimer.GetElapsed() < 16.0f)
            {
                Sleep(0);
            }
        }
    }
    else
    {
        ReleaseResources(); // app inactive (e.g. minimized): free what we can
    }

    if(BRestartPending)
    {
        BRestartPending = false;
        SilentRestart();
    }
}

======================================================================================

*yeah I’m mocking different IDEs. deal with it :D This is sarcasm. There is no proven link between masculinity and choice of game development environment.**

**yet.

The problem with games like Democracy 3, or indeed Gratuitous Space Battles, is that they are VERY hard to balance. Balancing a game is exceedingly hard, even if you agree, like me, that ‘some’ imbalance is actually what can make a game fun. (If all choices are equally effective, why pretend there is strategy to the game?) For big AAA studios with huge testing teams, pre-release balance is easier. For one or two man indie studios, it’s almost impossible to do in-house.

The common solution is early-access / pre-orders with beta access, which gets a lot of people playing the game. However, this assumes that:

a) Those players are representative in terms of skill, time, and play style to the wider public

b) You get decent feedback from all those players, not just the hardcore, and

c) The feedback from the players is impartial and can be relied upon.

With a game like Democracy 3, c) is a real issue. A lot of people complain that capitalists are hard to please, but how do we know if that’s true, or if it’s feedback from socialist players? Anyway, as a geek, and a coder, and someone who embraces complexity, an alternative strategy of letting the game balance itself obviously appeals. So clearly, getting the game to play itself constantly and discover, then report, and maybe even auto-tweak and adjust its own rules, is the ultimate goal.

This would be an ideal ‘summer vacation’ project for me.

Actually, the bit most people are probably wary of (coding the part where the game comes up with a strategy, plays the game, and develops improvements to its play style based on the results) isn’t the bit I’m worried about coding. That bit won’t be too bad (I’m not aiming for AlphaGo levels of skill here). The bit that makes me roll my eyes and go ‘Can I be arsed?’ is the mechanics of introducing a wrapper to the game that lets it play itself. The problem here is just doing a nice, clean job of introducing such a system without breaking the game or introducing anomalies.

This is where DeepMind’s stuff is AMAZING, because their Atari-playing AI simply uses visual input. Frankly that ain’t gonna happen here, so I’d have to embed code in there… or would I? An ideal system would be hands-off: a new ‘GeneticD3.exe’ that launches D3 non-invasively, makes player decisions, then handles everything without touching any D3 code directly… Surely that cannot be done?

Mwahahahaha. It possibly can… almost, because D3 is already set up to autosave each turn. And because it’s turn-based, all the ‘decisions’ happen at once. So all I really need to do is write code that parses the D3 save game (handily already XML), makes decisions about what to do, runs those decisions through the D3 engine (that bit gets tricky), and then dumps out the next turn’s save game. Repeat ad infinitum. To do some rough maths: let’s say saving and loading D3 takes about 2 seconds (I rule at optimizing), and we give a generous 2 seconds to make a turn’s decisions. That’s 4 seconds a turn, or (at 16 turns a term) 64 seconds per term in office, or a 4-term complete game in 256 seconds, so roughly 14 games per hour, or 300-odd games a day. That’s actually remarkably slow. If AlphaGo needed 100,000 games to learn, I’d need the best part of a year. Still, assuming some cunning multithreading and optimizing (I don’t ever need to touch the disk really; all save/loads could be trapped in a RAM buffer), I could probably quadruple the speed, so 1,200 games a day?
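
To make the shape of it concrete, the outer driver would be something like this. To be clear, every function here is hypothetical; this is just the skeleton of the idea, not real D3 code:

======================================================================================
#include <string>

// All hypothetical: the actual hard work hides inside these.
std::string LoadSaveGame(const std::string& path);       // parse the XML autosave
std::string ChooseDecisions(const std::string& saveXml); // AI picks policies / sliders
std::string RunTurn(const std::string& saveXml,
                    const std::string& decisions);       // push through the D3 engine
void RecordResult(const std::string& finalSaveXml);      // log outcomes for tuning

int main()
{
    std::string save = LoadSaveGame("start.xml");
    for (int turn = 0; turn < 64; ++turn) // 16 turns x 4 terms, as per the maths above
    {
        std::string decisions = ChooseDecisions(save);
        save = RunTurn(save, decisions);  // ideally kept in a RAM buffer, never on disk
    }
    RecordResult(save);
    return 0;
}
======================================================================================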

I am seriously thinking about it. It would definitely be fun.

I’ve had problems for at least a few years when it comes to coding purely for fun. That’s not to say I don’t enjoy coding. I LOVE coding, it’s my passion, but I made the mistake (in some ways) of turning my passion into my job, and then my career and my whole livelihood & retirement plan, and when you do that, suddenly when you are writing code you have a little voice at the back of your head saying “who is going to buy this?”. And that’s not a problem; in fact it’s a GOOD thing, because it means you release commercial games and not arty self-indulgent bullshit about crying and existential angst among cartoon Bolivian hamster-weaving. That’s how I’ve stayed in business.

But the problem with working for yourself, at home, when you are the boss, is that you can work WHENEVER you like, and this means the line between working and having fun gets not so much blurred as obliterated.


If I decide to take a day off work (madness!), I really can’t go near a PC, because the PC is where I work, and my office is for work, not for fun. It’s hard enough to sit here at this desk and play games instead of working, but if Visual Studio is open then I am IN WORK MODE. My brain goes all serious and strategic and long-term.

So I’m trying to shake myself out of that and re-discover the joy of pure creation as a hobby, as fun, as something experimental and silly, and not something that I expect to ever charge money for. I will probably never get around to achieving anything, and certainly not making anything public (unless miraculously I make something I’m not ashamed of). The main goal is going to be to learn how to code some stuff without getting all world-domination and work-ethic about it.

I know a lot of people do one-game-a-month stuff and game jams, but that’s just not my scene. I prefer coding for fun to designing for fun. Design is too intense for me. I can practically code while I’m asleep. Ask anyone who’s tried to use my code :D.

I was never 100% happy with the radiation effect in Gratuitous Space Battles 2, so I’ve coded a better version for the next patch. Here it is in very short silent video form:

I thought maybe some people might be interested in the theory of how it is done. Now I’m sure that if this was in 3D in Unity there would already be a 99c plug-in that does it without you even having to understand coding, but that’s not how it’s done, so here we go…

The first thing to remember is that the spaceship is not a 3D mesh at all. It’s a series of sprites rendered to different render targets and then composited. That means I’m not wrapping a texture around a mesh, but drawing one sprite on top of another, cropping the final output to the alpha channel of the base (ship outline) sprite.

Before I learned much about shaders, I had a system that did this using mostly fixed function, which I still use here, but build on with better shaders to do more stuff, making it a hybrid approach and fairly complex :D. To simply draw one sprite on top of another, but use the alpha channel from the bottom sprite, I use a custom vertex format that has 2 sets of texture co-ordinates. One set is for the splatted ‘radiation’ image, and the other is for the alpha channel of the ship outline.
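
In D3D9 terms, such a vertex ends up looking roughly like this (a sketch of the format described, not my actual struct):

======================================================================================
#include <d3d9.h>

// Sketch of a two-UV vertex format: one UV set for the radiation splat,
// one for the matching spot in the ship outline's alpha channel.
struct SplatVertex
{
    float    X, Y, Z, RHW;  // pre-transformed screen-space position
    D3DCOLOR Diffuse;       // per-splat color (used later for fading)
    float    U1, V1;        // into the radiation splat texture
    float    U2, V2;        // into the ship outline texture's alpha
};
#define SPLATVERTEX_FVF (D3DFVF_XYZRHW | D3DFVF_DIFFUSE | D3DFVF_TEX2)
======================================================================================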

It gets a bit fiddly, because the splat sprite could be drawn anywhere on the ship, so I need to work out the offset and dimensions of the splat in relation to the ship image, store that in the second set of texture UVs, and pass that into a shader.

The shader then simply reads the color and alpha data from the radiation splat, multiplies that alpha by the alpha at the same point on the base ship texture, and voila, the image is cropped to the ship.
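
Written out as plain code (the real thing is a pixel shader, but the per-pixel math is just this):

======================================================================================
struct Color { float R, G, B, A; };

// The crop boils down to one multiply: the splat keeps its own color,
// but its alpha is scaled by the ship's alpha at the same position.
Color CropSplatToShip(const Color& splat, float shipAlpha)
{
    Color out = splat;
    out.A = splat.A * shipAlpha; // zero outside the hull, so nothing draws there
    return out;
}
======================================================================================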

But that’s a very simplistic explanation, because there is more going on. To make things fast, all of my radiation gets drawn together (to minimize texture / render target / state change hassle), so it’s done towards the end of processing. In other words, there is no painters-algorithm going on, and I need to check my depth buffer against this ship’s Z value to see if the whole splat is obscured by some nearby asteroid or debris. That requires passing the screen dimensions into the shader, and also setting the depth buffer as a texture the shader can read.

So that gets us a nicely Z-cropped image that wraps around the ship at any position. It also fades in and out, so we need to pass in a diffuse color for the splat sprite and calculate that too. We also have to do all this twice: once for the splat, and once for the slightly brighter, more intense lightmap version, allowing the radiation effect to slightly light up areas of the ship hull during final composition.

At this point the shader is a bit complex…and there is more…

I wanted the radiation to ‘spread’, and to do that, I need to splat the whole thing at a fixed size, but gently reveal it expanding outwards. To do this I create yet another texture (the spread mask), which also gets passed to the shader. There were various ways to achieve this next bit, but what I did was to ‘lie’ to the shader about the current UV positions when sampling the spread mask. Basically, I calculate and pass in a ‘progress’ value for the spread, and I use that, inverted, to deflate the UVs of the source image (which is CLAMPed). So effectively my initial UVs are -4,-4,4,4, and the spread splat is a small circle in the center of the splat, expanding outwards to full size.
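
As a sketch, the UV trick is just a scale around the center of the splat (hypothetical names, same idea):

======================================================================================
// Sketch of the spread-mask UV 'lie'. progress runs from just above 0 to 1;
// scaling the UVs around the center means a CLAMPed mask texture appears
// shrunk to a small central circle early on, growing to full size at 1.
void GetSpreadMaskUV(float progress, float& u, float& v)
{
    float scale = 1.0f / progress;  // inverted progress; caller keeps progress > 0
    u = 0.5f + (u - 0.5f) * scale;  // e.g. progress 0.125 -> UVs span roughly -4..4
    v = 0.5f + (v - 0.5f) * scale;
}
======================================================================================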

Because I’m only doing this when I sample the alpha channel from the spread mask, the color and alpha data from the base ‘splat’ texture stays where it is, so it looks like a static image being revealed by an expanding circular (but gaussian-blurred) mask.

I’m pretty pleased with it. The tough bit, which I’ll leave for you to work out, is how this works when the source radiation image is actually a texture atlas of different radiation splats :D

Fun fun fun.