Game Design, Programming and running a one-man games business…

Supporting modded content

Supporting mods is normally pretty easy, especially if you have your data left pretty open. The place it gets tricky for GSB2 is online challenges. People post their fleet to other players as a challenge, and this works great when the only content in your challenge is content you *know* the other player has on their hard drive, but the minute you allow modding it gets kinda complex.

It’s pretty late in GSB2’s development for me to realize (less than two weeks from release) that the way I was handling it was not actually working, so yesterday I had to go back to the drawing board, and I have high hopes it will be cracked by the end of the day, especially regarding the most likely form of modding, which is extra hulls, modules and ship components (visuals).

When people posted a challenge in GSB1, it was just a binary GSB file. I had a really rubbish system where the player had to tick boxes when issuing a challenge saying what extra DLC content might be included. In fairness, the game did actually check on launching a challenge and warn people if they needed extra content, but it really did kinda suck.

With GSB2 I’m improving this. I hacked the mod-content code last night so it tells each piece of moddable content (module/component/hull) what mod it came with. The base game has been designated as just another mod, which makes this nice and easy. Then, when I post a challenge, the game can scan through every ship’s hull, modules and components and make up a list of all the content packs required, which in 99% of cases will be just the base game. It then sticks that in the header file for the challenge.
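As a rough illustration of that scan, here is a minimal C++ sketch. The `Part` and `Ship` structs, the `required_mods` function and the `"base"` mod name are all made up for the example, not GSB2's actual data model; the only idea taken from the text is that every part knows which mod it came from and the base game counts as a mod.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical data model: every moddable part remembers which mod supplied it.
struct Part {
    std::string name;
    std::string mod;  // "base" stands for the stock game, treated as just another mod
};

struct Ship {
    Part hull;
    std::vector<Part> modules;
    std::vector<Part> components;
};

// Walk every hull, module and component in the fleet and collect the
// distinct mod names. std::set drops duplicates and keeps the names sorted.
std::set<std::string> required_mods(const std::vector<Ship>& fleet) {
    std::set<std::string> mods;
    for (const Ship& ship : fleet) {
        mods.insert(ship.hull.mod);
        for (const Part& part : ship.modules) mods.insert(part.mod);
        for (const Part& part : ship.components) mods.insert(part.mod);
    }
    return mods;
}
```

The resulting set is what would get written into the challenge header; for a fleet built purely from stock parts it contains the single entry `"base"`.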

Theoretically I can parse that file on the server and store it in a database, and thus show a player what content requires mods and what doesn’t. Also theoretically I can direct them to the download page for that mod. In an ideal world, I’d break mods apart automatically upon submission and handle the file delivery along with the challenge, so you would magically get any extra required content. The bandwidth requirements there might be a pain, and this is all fantasy work for after release.

But I am confident that I will release with a system that at least lets people put together mods, and use them within challenges without any confusion or random crashing. Worst case situation is a popup on a challenge saying “sorry, this requires the ‘l33t ship hulls mod’ and you don’t have it yet!”. That will just be phase one.
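The phase-one check boils down to a set difference between what the challenge needs and what the player has installed. A minimal sketch follows, with hypothetical function names (`missing_mods`, `challenge_warning`) and the assumption that mods are identified by plain name strings:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Names that the challenge requires but the player has not installed.
// Both inputs are std::set, so they are already sorted as set_difference needs.
std::vector<std::string> missing_mods(const std::set<std::string>& required,
                                      const std::set<std::string>& installed) {
    std::vector<std::string> missing;
    std::set_difference(required.begin(), required.end(),
                        installed.begin(), installed.end(),
                        std::back_inserter(missing));
    return missing;
}

// Build the popup text, or an empty string when nothing is missing.
std::string challenge_warning(const std::vector<std::string>& missing) {
    if (missing.empty()) return "";
    std::string text = "sorry, this requires ";
    for (std::size_t i = 0; i < missing.size(); ++i) {
        if (i > 0) text += ", ";
        text += "'" + missing[i] + "'";
    }
    text += " and you don't have it yet!";
    return text;
}
```

Doing this before the challenge loads means a missing hull or module is reported as a message rather than discovered as a crash halfway through the battle.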

GSB1 had a superb modding scene, and I want to be supporting that in GSB2 from day one. I suspect the ship design Steam Workshop submission stuff will help get more people interested in the mod scene, and the integration of a mod control panel will also make mod management a lot easier.

 

 

Gratuitous Space Battles 2 with Graphics Debugging!

So here is something you might enjoy, especially if you like gratuitous charts and stats porn, or are interested in graphics programming, or maybe you just want to tell me I’m doing it wrong. This is a video of me demonstrating NVIDIA Nsight, and how I use it to spot things that are inefficient in my engine for Gratuitous Space Battles 2. I’ve already fixed the two inefficiencies I point out in the video, while I was waiting for it to upload :D Enjoy! (And please share!)

multithreading sound engine bug…

I have a bug that’s driving me nuts. I use some middleware as a sound engine. It’s the ONLY middleware I use, and it’s bugging me. Theoretically it’s easy to use, but I have a situation that it seems incapable of coping with.

With this middleware, I can play a sound, and request a pointer to track it. I can use that pointer later to adjust volume, or stop the sound, or query if it’s finished. For various reasons, I need to keep my own list of what sounds are currently playing. Thus I have a list of ‘current playing sounds’.

The middleware gives me a callback which triggers when a sound ends. This is handy, as I can then loop through the current playing sounds and remove it, keeping that list up to date. The sound engine runs in its own thread, so that callback triggers in a different thread to the main game.

This is where it goes wrong (but only on fast speed). I decide, from the main game, to stop a sound. I first check that the sound exists within the current playing sounds. It does, so I access the sound pointer and tell it to stop. But wait! In between those two events, the sound has expired naturally (in another thread) and the pointer has become invalid. CRASH.

Using critical sections just swaps the crash for a deadlock, because stopping the sound has to happen in the same thread as the callback, and there are likely several sounds generating callbacks in the same frame (on fast speed) as the one I’m trying to stop. It’s a real pain.

One solution is to make all such sounds loop (and thus never expire naturally, and rely on me killing them, which should work ok) and I thus never hit this problem. Another is to just not stop them prematurely (looks weird). I have currently hacked it, but I suspect the 1.18 build still has this issue manifesting itself as a 4x speed lots of beam-lasers crash.

Another solution is to tell the sound engine to run single threaded but that seems horrendously hacky.
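For comparison, a common general pattern for this kind of callback bookkeeping is to let the callback do nothing except queue a notification, and have the main thread apply the queue once per frame. The sketch below is generic C++ and not tied to the middleware: `SoundId` and `PlayingSounds` are hypothetical names, and it assumes sounds can be tracked by an integer id. It only keeps the ‘current playing sounds’ list consistent without ever holding a lock while calling into the sound engine; it does not by itself make a raw engine pointer safe to use after the sound has ended, which depends on what the middleware guarantees.

```cpp
#include <cstdint>
#include <mutex>
#include <unordered_set>
#include <vector>

using SoundId = std::uint32_t;  // hypothetical handle, not a middleware type

class PlayingSounds {
public:
    // Main thread: remember a sound that has just been started.
    void started(SoundId id) { playing_.insert(id); }

    // Audio thread: called from the end-of-sound callback. It only appends to
    // a small queue, so the lock is held very briefly and never while calling
    // into the sound engine.
    void finished_from_callback(SoundId id) {
        std::lock_guard<std::mutex> lock(queue_mutex_);
        finished_.push_back(id);
    }

    // Main thread, once per frame: apply the queued notifications.
    void drain_finished() {
        std::vector<SoundId> done;
        {
            std::lock_guard<std::mutex> lock(queue_mutex_);
            done.swap(finished_);
        }
        for (SoundId id : done) playing_.erase(id);
    }

    // Main thread only.
    bool is_playing(SoundId id) const { return playing_.count(id) != 0; }

private:
    std::unordered_set<SoundId> playing_;  // touched by the main thread only
    std::mutex queue_mutex_;               // guards finished_ and nothing else
    std::vector<SoundId> finished_;
};
```

Because the main thread never waits on the audio thread while holding `queue_mutex_`, the lock cannot take part in the kind of deadlock described above.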

I may have to try the always-loop solution. One day I’ll write my own sound engine again.

 

Draw list sorting and concurrency issues

Background: I use DirectX 9 to develop Gratuitous Space Battles 2 using my own engine.

I’ve been doing my best to reduce the number of draw calls per frame for complex scenes in GSB2. Basically I have a lot of stuff with different textures and render states, and they are being drawn from front to back rather than z-buffer sorted (for reasons concerning sprites & high quality alpha blended edges).  What this means is, when you have 16 identical laser beam turrets, you may not be able to draw them as a single batch, because in-between them you might be drawing other stuff. As a result you get 16 draws instead of one. Ouch. That causes driver slowdown, directx slowdown, and inefficiency on the GPU, which prefers big batches.

Of course you can immediately see that it would be fine to go through the draw list (one of several actually, for composition reasons), and spot all those cases where you have 2 or more turrets (or any sprite) that use the same texture and which do NOT have anything in between them that overlaps the first one, and grab that second turret and draw it ‘early’ with the other one. And in fact, that works just fine. Suddenly lots of draw calls get optimized away! (the green ones) In this case, out of 1,159 draw calls, 713 get optimized away into batches.
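A minimal sketch of that reordering pass, under some loud assumptions: `DrawItem`, `Rect` and the integer texture id are invented for the example, and the overlap test used here checks the in-between items against the sprite being pulled forward, since that is the one whose draw order changes. It is the brute-force version, not the engine's tuned code.

```cpp
#include <cstddef>
#include <vector>

struct Rect { float x0, y0, x1, y1; };

// Axis-aligned overlap test; rectangles that merely touch do not count.
bool overlaps(const Rect& a, const Rect& b) {
    return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

struct DrawItem {
    int texture;  // hypothetical texture id; the real thing would be a pointer
    Rect bounds;
};

// For each item, pull later items with the same texture forward so they sit
// next to it, but only if nothing still waiting in between overlaps the item
// being moved (otherwise the visible layering would change).
std::vector<DrawItem> batch_draw_list(const std::vector<DrawItem>& in) {
    std::vector<bool> taken(in.size(), false);
    std::vector<DrawItem> out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (taken[i]) continue;
        out.push_back(in[i]);
        for (std::size_t j = i + 1; j < in.size(); ++j) {
            if (taken[j] || in[j].texture != in[i].texture) continue;
            bool blocked = false;
            for (std::size_t k = i + 1; k < j && !blocked; ++k) {
                blocked = !taken[k] && overlaps(in[k].bounds, in[j].bounds);
            }
            if (!blocked) {
                out.push_back(in[j]);
                taken[j] = true;
            }
        }
    }
    return out;
}

// One draw call per run of consecutive items sharing a texture.
std::size_t count_draw_calls(const std::vector<DrawItem>& list) {
    std::size_t calls = 0;
    for (std::size_t i = 0; i < list.size(); ++i) {
        if (i == 0 || list[i].texture != list[i - 1].texture) ++calls;
    }
    return calls;
}
```

With two same-texture turrets separated by an unrelated sprite that touches neither, the pass turns three draw calls into two; if the unrelated sprite overlaps the second turret, the order is left alone.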

[Image: megabatched]

There is an immediate problem though. This is extremely slow, even with every optimization trick in the book. Assume you have a list of 1,000 objects to draw (not inconceivable for big battles). In the worst case situation, that means comparing object 1 to 999 different others and doing a bounds check. Then object 2 gets compared to 998, then 3 to 997, and so on, which adds up to 999 + 998 + … + 1 = 499,500 comparisons. That is a LOT of function calls (inlineable I know…), and a lot of bounding box comparisons, and a lot of texture comparisons (only a pointer compare, but I need to extract that texture pointer from each renderable object, and at 500,000 de-references per frame even that adds up).

Now granted this is all total worst case. Some of those 1,000 objects aren’t batchable, some of them *will* quit early as something overlaps, and when I do batch future ones with early ones, those future ones themselves don’t need to be checked as they have been optimized away.

The trouble is, after profiling, it is still about 70% slower for the CPU to do this, than not do it. The big problem here is that I’m making stuff light for the GPU, heavy on the CPU. Is that a good idea? Maybe… But as it happens, if I assume a dual core (or better) PC, it doesn’t matter because it is FREE. I have other threads just sat there. If I find a slot in my main loop between building the draw list and having to actually render from it, I can multi-thread that new slower batching code and actually have the whole app run faster, thanks to fewer draw calls later on. The Visual C++ concurrency profiler shows it works:

[Image: Visual C++ concurrency profiler capture]

Previously I’d have gone through and built up the draw list (that’s the blue), then done some particle drawing preparation (green), then batched my draw calls (purple) and then drawn everything (yellow). Because the particle stuff actually gets put into a different list, I can mess around with my slow batching in a new thread while the main thread prepares particle stuff. Hence the purple bar is now on a new thread and works alongside the main one. As it happens I also have 2 more threads doing some particle emitter stuff at the same time as well, so briefly I’m at 100% utilization on 4 threads, possibly 4 cores (other processes such as the music streaming/driver might be on one of them).
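The overlap between batching and particle preparation can be sketched with standard C++ like this. It uses `std::async` purely for illustration (the engine's own threading is not shown in the post), and both `batch_draw_list` and `prepare_particles` are placeholder workloads with made-up names.

```cpp
#include <algorithm>
#include <future>
#include <utility>
#include <vector>

// Placeholder for the slow batching pass. Grouping texture ids with a stable
// sort is NOT real batching (it ignores overlaps); it just gives the worker
// thread something to do.
std::vector<int> batch_draw_list(std::vector<int> texture_ids) {
    std::stable_sort(texture_ids.begin(), texture_ids.end());
    return texture_ids;
}

// Placeholder for the particle preparation that stays on the main thread.
int prepare_particles(int emitter_count) {
    int particles = 0;
    for (int i = 0; i < emitter_count; ++i) particles += 10;  // pretend: 10 per emitter
    return particles;
}

struct PreparedFrame {
    std::vector<int> batched;
    int particle_count;
};

PreparedFrame prepare_frame(std::vector<int> draw_list, int emitter_count) {
    // Kick the batching off on another thread...
    std::future<std::vector<int>> batched =
        std::async(std::launch::async, batch_draw_list, std::move(draw_list));
    // ...and do the particle work here in the meantime. This is safe because
    // the two jobs write to different lists.
    const int particles = prepare_particles(emitter_count);
    // get() blocks until the worker is done, so rendering never starts from a
    // half-batched list.
    return PreparedFrame{batched.get(), particles};
}
```

The win only exists if the main thread has enough independent work to cover the batching time; otherwise `get()` simply waits and nothing is gained.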

So in a sense, yay! faster code, but what a nightmare to measure. It depends on scene complexity, relative CPU/GPU speed, number of cores and god knows what else. However it is worth remembering that sometimes slower code will make your game run faster, it just depends where that slow code runs.

Optimizing my gratuitous GUI

If you’ve watched high-def videos of Gratuitous Space Battles 2, or been lucky enough to try it at EGX, then you may have noticed all that gratuitous GUI fluff that animates and tweaks and flickers all over the place, because… frankly, I like that kinda nonsense, and it’s called GRATUITOUS, so I can get away with it. This sort of stuff…

[Image: widgets]

Anyway… I love it, and it’s cool, but until today it was coded really badly. I had a bunch of helper functions to create these widgets, and I sprinkled them all over the GUI for fun. For example there was a function to add an animated horizontal bar that goes up and down. Wahey. Also there are functions to add random text boxes of stuff. The big problem is that each one was an isolated little widget that did its own thing. In the case of a simple progress bar widget, it would have a rectangle, and a flat-textured shaded box inside it that would animate. That meant 2 draw calls, one for the outline of the box (using a linelist) and the other was a 2-triangle trianglestrip which was the box inside it. That was 2 draw calls for a single animated progress bar thing, and a single GUI window might have 6 or even 20 widgets like that… so suddenly just adding a dialog box means an additional 40 draw calls.

Normally that doesn’t matter because a) 40 draw calls isn’t a lot, and b) graphics cards can handle it. However, it’s come to my attention that on some Intel integrated cards, which are actually surprisingly good at fill rate and general poly-drawing, too many draw calls really piss them off, performance wise. Plus… 40 draw calls isn’t a lot, if that’s your ‘thing’, but if there are 40 on the minimap, 40 on each score indicator, 40 on the comms readout, 40 on each of 3 ship inspector windows, then suddenly you have several hundred draw calls of GUI fluff, before you do the actual real GUI, let alone the big super-complex silly space battle, and yup… I’ve seen 4,000 draw calls in a frame. Ooops. To illustrate this, here is that top bunch of widgets in wireframe.

[Image: widgets in wireframe]

That’s a lot of stuff being drawn just for fluff, so to ease the burden on lesser cards, I should be batching it all, and now I am. That used to be about ten trillion draw calls and now it’s about five. I have a new class which acts as a collection of all the widgets on a certain Z-level, and it goes through drawing each ‘type’ of them as its own list. Nothing actually draws itself any more, it just copies its verts to the global vertex buffer, and then when I need to, I actually do a DrawIndexedPrimitiveVB() call with all of them in one go.

Ironically, this all involves MORE verts than before, because whereas drawing a 12 pixel rectangle with a line list involves 4 verts, drawing it as a trianglelist uses loads more, but I’m betting (and it’s a very educated bet) that adding the odd dozen verts is totally and utterly offset by doing far, far fewer draw calls.
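To make the trade-off concrete, here is a small CPU-side sketch of such a batch. `WidgetBatch`, `Vertex` and the method names are invented for the example, the outline is built from four thin quads (one possible way to express a rectangle outline as a triangle list), and `flush()` only stands in for the real upload and indexed draw.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vertex { float x, y; std::uint32_t colour; };

// Collects geometry for every widget of one type on one Z-level, so the whole
// lot can be submitted as a single indexed triangle list.
class WidgetBatch {
public:
    // A filled rectangle: 4 verts, 6 indices (two triangles).
    void add_quad(float x0, float y0, float x1, float y1, std::uint32_t colour) {
        const std::uint32_t base = static_cast<std::uint32_t>(verts_.size());
        verts_.push_back(Vertex{x0, y0, colour});
        verts_.push_back(Vertex{x1, y0, colour});
        verts_.push_back(Vertex{x0, y1, colour});
        verts_.push_back(Vertex{x1, y1, colour});
        const std::uint32_t offsets[6] = {0, 1, 2, 2, 1, 3};
        for (std::uint32_t offset : offsets) indices_.push_back(base + offset);
    }

    // A rectangle outline as four thin quads: 16 verts instead of a handful
    // for lines, but it can share the same draw call as everything else.
    void add_outline(float x0, float y0, float x1, float y1,
                     float thickness, std::uint32_t colour) {
        add_quad(x0, y0, x1, y0 + thickness, colour);                          // top
        add_quad(x0, y1 - thickness, x1, y1, colour);                          // bottom
        add_quad(x0, y0 + thickness, x0 + thickness, y1 - thickness, colour);  // left
        add_quad(x1 - thickness, y0 + thickness, x1, y1 - thickness, colour);  // right
    }

    std::size_t vertex_count() const { return verts_.size(); }
    std::size_t index_count() const { return indices_.size(); }

    // Stand-in for "copy to the vertex buffer and issue one indexed draw".
    // Returns the number of draw calls that would be issued.
    std::size_t flush() {
        if (indices_.empty()) return 0;
        verts_.clear();
        indices_.clear();
        return 1;
    }

private:
    std::vector<Vertex> verts_;
    std::vector<std::uint32_t> indices_;
};
```

One progress bar (outline plus fill) costs 20 verts here rather than a handful, but twenty such bars still flush as one draw call instead of forty.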

This is how I spend Sunday Afternoons when it’s too cold for archery…