Game Design, Programming and running a one-man games business…

Gratuitous Space Battles 2 Ship Customizer! Oh yes…

Behold the latest video. I think you will like this one. Plus, this is with my spangly new (pricey) microphone. Can you tell the difference? Not sure I can, but let me know if it’s any better!

I’m sure people will have some questions. If you don’t see that this is a big cool new feature that totally changes GSB, then watch it again :D It’s going to be awesome fun seeing what people can do, especially given how crazy people went with mods for GSB1, and that was without any ability to change graphics within the game. Let me know what you think…

Gratuitous Space Battles 2 in multi-monitor mode!

At last a shaky-cam (well not shaky, but you know what I mean) video of GSB 2! I wanted to do this to show off multiple monitor mode with a lemon for scale. The video shows my dev PC with the game running. My PC is an i7 3770 quad-core with 8 GB of RAM, Windows 7 and a GeForce GTX 670 video card, powering two 27″ monitors for a total GSB2 fun ratio of 5120×1440, or over 7 million pixels of lasers and explosions. Here is the video:

I’ll be doing more videos over the next few months to keep you all updated, plus other things are in the pipeline :D. In future I’ll capture normal in-game footage; I just wanted to do a multi-monitor one :D Help me spread the word about 7 million pixels of explosions with ‘likes’ and ‘shares’. I reckon I’ll be more popular than these YouTube kids by tomorrow!

BTW the game’s current website is at www.gratuitousspacebattles2.com (it will get a makeover eventually), I blog about the game here, occasionally tweet about it (@cliffski) and there are forum discussions here.

 

Thoughts on multi-threaded 2D game development in DirectX 9

Am I the only person doing this? Probably. I often am. Most people have moved on from DX9 (I know it so well that there is a big opportunity cost to updating) or use OpenGL, and very few people are doing 2D games where performance is an issue. I am taking early steps with Gratuitous Space Battles 2, and my aim is to have it run at 60 FPS on average hardware with two 1920×1080 monitors. I also intend to get it running OK on bigger setups. That’s a lot of pixels, and due to all sorts of fanciness I’m adding to GSB 2.0, it means a lot of processing. A REAL lot.

So… multithreading! It’s about time I ventured forth. To date, my only multithreading efforts have been the async server communication in GSB 1.0 for challenge uploads etc., and the loading screen for GTB and Democracy 3. Actual mid-game multithreading has scared me until now.

I hate middleware so I’m not using any libraries, just raw calls to CreateThread, TerminateThread and so on… This might make it more complex, but it means I have complete control over stuff. My first experiments were not exactly encouraging. I attempted to speed up the position calculations of asteroids. Now to cut a long story short, I use D3DTLVERTEX style stuff (not hardware transform and lighting) for good reasons I won’t bore you with. The upshot is, I have a lot of non-DirectX transform stuff to do for anything drawn on the screen.

An ideal case for multithreading!


So I wrote code to split up the asteroids into 8 chunks (test case of an 8-core chip), and gave each processor a list of asteroids to process. Result? SLOWER. Actually quite a bit slower. Some fiddling with AQTime (my profiler) let me analyze cache misses for each thread, and I also profiled it as 1 thread. The cache miss rate went through the roof. Basically, my transform code was relying on some global camera data, and I suspect that either:

a) Referencing the camera data was a bottleneck, with the threads blocking each other from getting it, or…

b) The asteroid transform data was laid out in memory in such a way that all the different threads kept fighting for the same cache lines and generally getting in each other’s way.
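For anyone who wants to picture the chunked approach, here is a minimal sketch of it, not my actual code. It assumes std::thread instead of raw CreateThread (purely so it compiles anywhere), and the Camera and Asteroid structs, TransformRange and TransformAsteroids are all made-up names for illustration. Each thread gets one contiguous run of asteroids and its own private copy of the camera, which is one way you might try to rule out both (a) and (b).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the game's real types.
struct Camera { float x, y, zoom; };
struct Asteroid { float world_x, world_y, screen_x, screen_y; };

// Transform one contiguous range of asteroids from world to screen space.
// The camera is taken by value, so each thread reads its own private copy
// rather than a shared global.
void TransformRange(Asteroid* first, Asteroid* last, Camera cam) {
    for (Asteroid* a = first; a != last; ++a) {
        a->screen_x = (a->world_x - cam.x) * cam.zoom;
        a->screen_y = (a->world_y - cam.y) * cam.zoom;
    }
}

// Split the array into at most thread_count contiguous chunks, one per thread.
// Contiguous chunks mean two threads only ever touch the same cache line at
// the boundary between their chunks.
void TransformAsteroids(std::vector<Asteroid>& asteroids, const Camera& cam,
                        unsigned thread_count) {
    assert(thread_count > 0);
    const std::size_t total = asteroids.size();
    const std::size_t chunk = (total + thread_count - 1) / thread_count;
    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < total; begin += chunk) {
        const std::size_t end = std::min(begin + chunk, total);
        workers.emplace_back(TransformRange, asteroids.data() + begin,
                             asteroids.data() + end, cam);
    }
    for (std::thread& t : workers) t.join();  // wait for every chunk
}
```

Whether this layout would actually have fixed my slowdown is a guess; creating 8 threads every frame has its own cost, so treat it as a starting point for profiling rather than an answer.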

I spent a lot of time reading and fiddling and decided that it wasn’t working (although did manage a decent few speedups in other ways). I then decided that if lots of threads sharing the same job wasn’t going to help, maybe lots of threads doing different (unrelated) jobs would…?

And this is more of a success. I have a function called ProcessFrame() which does a lot of non-DirectX stuff, such as the aforementioned asteroid transforming, updating engine glows, updating explosion plumes, particle effects and distortion waves blah blah… Until recently, it just did them one after the other. I then realized that although a lot of them accessed the same data (camera position stuff mostly), none of them altered it, and the tasks were quite discrete. So I packaged them up and sent them to different threads, and then spun in the main thread waiting for them to finish. Result? 21% faster. Yay? Not bad, but not 800% faster, which would have been theoretically do-able (not really, but…)
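Roughly, the one-thread-per-job idea looks like the sketch below. Again this is an illustration under assumptions: it uses std::thread rather than CreateThread, it blocks on join() where my real code spins in the main thread, and RunTasksInParallel is a made-up name.

```cpp
#include <functional>
#include <thread>
#include <vector>

// Run each task on its own thread and return once all of them have finished.
// Tasks may read shared data (the camera, say) but must not write to
// anything another task touches.
void RunTasksInParallel(const std::vector<std::function<void()>>& tasks) {
    std::vector<std::thread> workers;
    workers.reserve(tasks.size());
    for (const std::function<void()>& task : tasks) {
        workers.emplace_back(task);  // the thread runs its own copy of the task
    }
    // The frame cannot continue until the slowest task is done.
    for (std::thread& t : workers) t.join();
}
```

You would fill the vector with one lambda per job (asteroids, engine glows, plumes, particles and so on) and call it once per frame.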

Of course the missing link was that I am then left waiting for the slowest thread. Plus if I have more than 8 tasks, I run out of CPUs. So I re-coded it to have a queue of tasks, and when a thread finished a task, it checked the queue, and only reported it was done when the queue was empty. This was way more efficient, and easier to scale to available cores. Result? 41% faster!
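Here is a minimal sketch of that queue idea, assuming std::thread and std::mutex in place of the raw Win32 calls I actually use. FrameTaskQueue and its methods are hypothetical names, not my real classes. Each worker keeps pulling tasks until the queue is empty, so a thread that draws a short task just goes back for another one.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class FrameTaskQueue {
public:
    // Add a task to the back of the queue.
    void Push(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push(std::move(task));
    }

    // Start worker_count threads and return once the queue has been drained
    // and every worker has finished its last task.
    void RunAll(unsigned worker_count) {
        assert(worker_count > 0);
        std::vector<std::thread> workers;
        for (unsigned i = 0; i < worker_count; ++i) {
            workers.emplace_back([this] { WorkerLoop(); });
        }
        for (std::thread& t : workers) t.join();
    }

private:
    // Keep taking tasks until none are left, then exit.
    void WorkerLoop() {
        for (;;) {
            std::function<void()> task;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (tasks_.empty()) return;  // nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers are not blocked
        }
    }

    std::mutex mutex_;
    std::queue<std::function<void()>> tasks_;
};
```

The worker count can simply be the number of available cores, which is what makes this version easier to scale than one thread per task. A real engine would probably keep the threads alive between frames instead of recreating them in RunAll every time.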

Now obviously a 41% processing speedup is good (although this is pre-render, not render, so probably only a 20% FPS boost) but I can’t help thinking that if not 800%, a 200% speedup of that bit of code must be possible. Debugging cache misses is hard, as even AQTime will bluescreen occasionally on Windows 7 when profiling it. I’m pretty sure there is some cache or false-sharing issue going on.
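In case false sharing is new to you: it happens when two threads write to different variables that happen to sit on the same cache line, so the cores keep stealing the line from each other. The usual fix is to pad per-thread data out to a full cache line. The sketch below shows that technique in general; it is not a diagnosis of my code. It assumes 64-byte cache lines (typical for x86), std::thread rather than CreateThread, and PaddedCounter and CountInParallel are made-up names.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Assumption: 64-byte cache lines, which is typical for x86 CPUs.
constexpr std::size_t kCacheLineSize = 64;

// alignas rounds the struct size up to a whole cache line, so consecutive
// counters in an array are 64 bytes apart and never share a line.
struct alignas(kCacheLineSize) PaddedCounter {
    long value = 0;
};

static_assert(sizeof(PaddedCounter) == kCacheLineSize,
              "each counter should occupy exactly one cache line");

// Each thread increments only its own counter; the results are summed after
// all threads have joined.
long CountInParallel(unsigned thread_count, long iterations) {
    std::vector<PaddedCounter> counters(thread_count);
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < thread_count; ++i) {
        PaddedCounter* mine = &counters[i];
        workers.emplace_back([mine, iterations] {
            for (long n = 0; n < iterations; ++n) mine->value += 1;
        });
    }
    for (std::thread& t : workers) t.join();

    long total = 0;
    for (const PaddedCounter& c : counters) total += c.value;
    return total;
}
```

Swap PaddedCounter for a plain long and the result is the same, but the counters end up packed eight to a cache line, which is exactly the layout that can cause threads to fight over lines.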

In the meantime, GSB is now faster as a result, even if I spend no more time attempting to multithread it (and I will… I’ve only just got going). Anyone else attempting this sort of thing?